Compounds Having Affinity for the Granulocyte-Colony
Stimulating Factor Receptor (G-CSFR)

Technical Field

The present invention relates generally to novel compounds that have affinity for 
the granulocyte-colony stimulating factor receptor (G-CSFR). More particularly, the 
invention relates to such compounds which act as G-CSF mimetics by activating or 
inactivating the G-CSFR, or by affecting ligand binding to G-CSFR. The invention 
additionally relates to methods of using the novel compounds and pharmaceutical
compositions containing a compound of the invention as the active agent. The invention
has application in the fields of biochemistry and medicinal chemistry and particularly 
provides G-CSF mimetics for use in the treatment of human disease. 

Background Art

Granulocyte-colony stimulating factor (G-CSF) is a hematopoietic growth factor 
that specifically stimulates proliferation and differentiation of cells of the neutrophilic 
lineage. 

G-CSF is a cytokine that binds to and activates the granulocyte-colony 
stimulating factor receptor (G-CSFR). G-CSFR is expressed on the surface of mature
neutrophils and cells committed to the neutrophilic lineage, with receptor density varying 
from 190 to more than 1400 sites per cell. The receptor is a member of the cytokine 
receptor superfamily; it contains a cytokine receptor-homologous domain responsible for 
G-CSF binding, an immunoglobulin-like domain, three fibronectin type III domains, a 
transmembrane region, and an intracellular domain. The observed affinity of G-CSF for
its receptor is about 100 pM. 

The complete G-CSF protein has become an important therapeutic agent in 
clinical indications involving depressed neutrophil counts. Such indications include 
chemotherapy-induced neutropenia, AIDS and community-acquired pneumonia.
Furthermore, G-CSF antagonists may be useful in the treatment of some diseases caused
by an inappropriate or undesirable activation of G-CSFR.

There remains a need, however, for compounds that bind specifically to G-CSFR,
both for studies of the important biological activities mediated by the receptor and
for treatment of diseases, disorders and conditions that would benefit from activating or
inactivating G-CSFR. The present invention provides such compounds, and also provides 
pharmaceutical compositions and methods for using the compounds as therapeutic agents. 

Disclosure of the Invention 

In one embodiment, the invention provides compounds comprising a peptide
chain that binds to G-CSFR. In one aspect, the peptide chain is approximately 10 to 40
amino acids in length and contains a sequence of amino acids of formula (I) 

(I) C X_1 X_2 X_3 X_4 X_5 X_6 X_7 X_8 C (SEQ ID NO: 1)

wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X_1
is A, N, S, F, D, G, L, T, E, V, P, Q, H, M or K; X_2 is M, G, R, H, D, I, V, A, S, E, N, F,
Y, P, C, W or T; X_3 is E, V, W, F, M, A, N, S, L, T, Y, G or P; X_4 is V, I, G, Q, W, M, T,
Y, L, P, D, C, E or A; X_5 is M, E, W, L, P, N, I, T, V, F, Y, Q, S, R, W, G, H or D; X_6 is
H, A, W, Y, V, F, Q, M, N, E, S, D, P or G; X_7 is M, F, Y, V, N, L, H, D, S, W, G, Q, C or
T; and X_8 is C, Y, R, I, K, W, L, E, M, H, A, T, F, D, P, G or Q.

In another aspect, the peptide chain is approximately 9 to 40 amino acids in
length and contains a sequence of amino acids of formula (II)

(II) X^I_1 X^I_2 X^I_3 SGWVW X^I_4 (SEQ ID NO: 2)

wherein each amino acid is indicated by the standard one-letter abbreviation, and wherein
X^I_1 is S, Q, R, L or Y; X^I_2 is N, S, T, A or D; X^I_3 is E, D or N; and X^I_4 is L, V, T, P or H.

In another aspect, the peptide chain is 6 to 40 amino acids in length and contains
a sequence of amino acids of formula (III)

(III) ER X^II_1 X^II_2 X^II_3 C (SEQ ID NO: 3)
wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^II_1
is D, L, S, G, E, A, K or Y; X^II_2 is W, Y, F, L or V; and X^II_3 is F, G, M or L.

In still another aspect, the peptide chain is approximately 9 to 40 amino acids in
length and contains a sequence of amino acids of formula (IV)

(IV) X^III_1 MVY X^III_2 X^III_3 P X^III_4 W (SEQ ID NO: 4)

wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^III_1
is D or E; X^III_2 is A or T; X^III_3 is Y or V; and X^III_4 is P or Y.

In an additional aspect, the invention provides compounds comprising a peptide
chain that is approximately 12 to 40 amino acids in length and contains a sequence of amino
acids of formula (V)

(V) C X^IV_1 X^IV_2 X^IV_3 X^IV_4 X^IV_5 X^IV_6 X^IV_7 X^IV_8 X^IV_9 X^IV_10 C (SEQ ID NO: 5)
wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^IV_1
is E, G, P, N, R, T, W, S, L, H, A, Q or Y; X^IV_2 is S, T, E, A, D, G, W, P, L, N, V, Y, R or
M; X^IV_3 is R, Y, V, Q, E, T, L, P, S, K, M, A or W; X^IV_4 is L, M, G, F, W, R, S, V, P, A, D,
C or T; X^IV_5 is V, T, A, R, S, L, W, C, I, E, P, H, F, D or Q; X^IV_6 is E, Y, G, T, Q, M, S, N,
A or P; X^IV_7 is C, V, D, G, L, W, E, V, I, S, M or A; X^IV_8 is S, Y, A, W, P, V, L, Q, G, K,
F, I, E or D; X^IV_9 is R, W, M, D, H, V, G, A, Q, L, S, E or Y; and X^IV_10 is M, L, I, S, V, P, W,
F, T, Y, R, or Q.

In another aspect, the peptide chain is approximately 9 to 40 amino acids in
length and contains a sequence of amino acids of formula (VI)

(VI) X^V_1 X^V_2 X^V_3 X^V_4 X^V_5 X^V_6 C X^V_7 X^V_8 (SEQ ID NO: 6)

wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^V_1
is E, C, Q, V, or Y; X^V_2 is E, A, L, M, S, W, or Q; X^V_3 is K, R or T; X^V_4 is L, A, or V; X^V_5
is R, A, M, H, E, V, L, G, D, Q, or S; X^V_6 is E or V; X^V_7 is A or G; and X^V_8 is R, H, G or L.

In a further aspect, the peptide chain is approximately 10 to 40 amino acids in
length, binds to G-CSFR, and contains a sequence of amino acids of formula (VII)

(VII) X^VI_1 X^VI_2 X^VI_3 X^VI_4 X^VI_5 E X^VI_6 X^VI_7 X^VI_8 X^VI_9 (SEQ ID NO: 7)

wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^VI_1
is A, E or G; X^VI_2 is E, H or D; X^VI_3 is R or G; X^VI_4 is K, Y, M, N, Q, R, D, I, S or E; X^VI_5
is A, S or P; X^VI_6 is E, D, T, Q, K or A; X^VI_7 is R, W, K, L, S, A or Q; X^VI_8 is R or E; and
X^VI_9 is W, G, or R.

In a final aspect, the invention also provides peptides that, while not necessarily 
corresponding to one of the above-defined formulas, bind to G-CSFR. 
In some contexts, the compounds of the invention are preferably in the form of a
dimer. It is also preferred, in some contexts, that the compounds of the invention include a
peptide wherein the N-terminus of the peptide is coupled to a polyethylene glycol
molecule. In some contexts, it is preferred that the compounds of the invention include a
peptide wherein the N-terminus of the peptide is acetylated. In addition, it is preferred, in 
some contexts, that the compounds of the invention include a peptide wherein the C- 
terminus of the peptide is amidated. 

The invention also provides a pharmaceutical composition that comprises a 
therapeutically effective amount of a compound of the invention in combination with a 
pharmaceutically acceptable carrier, as well as a method for treating a patient who would
benefit from a G-CSFR modulator, the method comprising administering to the patient a 
therapeutically effective amount of a compound of the present invention. 



Brief Description of the Figures 

Figures 1-1, 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, and 1-11 provide the 
sequences of representative peptide chains contained within the compounds of the 
invention. 

Figures 2, 3, 4, 5, 6, 7, 8, 9A, 9B, 10A, 10B, and 11 are graphs showing the
results of various assays described in the Examples.

Detailed Description of the Invention 



I. Definitions and Overview

It is to be understood that unless otherwise indicated, this invention is not 
limited to specific peptide sequences, molecular structures, pharmaceutical compositions, 
or the like, as such may vary. It is also to be understood that the terminology used herein 
is for the purpose of describing particular embodiments only and is not intended to be
limiting.

It must be noted that, as used in the specification and the appended claims, the 
singular forms "a," "an" and "the" include plural referents unless the context clearly 
dictates otherwise. Thus, for example, reference to "a novel compound" in a 
pharmaceutical composition means that more than one of the novel compounds can be 
present in the composition, reference to "a pharmaceutically acceptable carrier" includes
combinations of such carriers, and the like.

In this specification and in the claims that follow, reference will be made to a
number of terms which shall be defined to have the following meanings: 

Amino acid residues in peptides are abbreviated as follows: Phenylalanine
is Phe or F; Leucine is Leu or L; Isoleucine is Ile or I; Methionine is Met or M; Valine is
Val or V; Serine is Ser or S; Proline is Pro or P; Threonine is Thr or T; Alanine is Ala or
A; Tyrosine is Tyr or Y; Histidine is His or H; Glutamine is Gln or Q; Asparagine is Asn
or N; Lysine is Lys or K; Aspartic Acid is Asp or D; Glutamic Acid is Glu or E; Cysteine
is Cys or C; Tryptophan is Trp or W; Arginine is Arg or R; and Glycine is Gly or G. In
addition, "1-Nal" is used to refer to 1-naphthylalanine, and "2-Nal" is used to refer to
2-naphthylalanine.
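
By way of illustration only, the abbreviation scheme above can be captured as a lookup table. The following Python sketch is not part of the disclosed invention; the names THREE_TO_ONE and to_one_letter are hypothetical and chosen for this example.

```python
# Three-letter to one-letter codes for the twenty conventional amino acids,
# following the abbreviations listed in the paragraph above.
THREE_TO_ONE = {
    "Phe": "F", "Leu": "L", "Ile": "I", "Met": "M", "Val": "V",
    "Ser": "S", "Pro": "P", "Thr": "T", "Ala": "A", "Tyr": "Y",
    "His": "H", "Gln": "Q", "Asn": "N", "Lys": "K", "Asp": "D",
    "Glu": "E", "Cys": "C", "Trp": "W", "Arg": "R", "Gly": "G",
}


def to_one_letter(residues):
    """Convert a list of three-letter codes into a one-letter sequence string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


# Glu-Arg-Asp-Trp-Phe-Cys corresponds to the one-letter string ERDWFC.
print(to_one_letter(["Glu", "Arg", "Asp", "Trp", "Phe", "Cys"]))
```

The sketch covers only the twenty conventional residues; unconventional residues such as 1-Nal and 2-Nal have no one-letter code and would need separate handling.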

Stereoisomers (e.g., D-amino acids) of the twenty conventional amino acids, 
unnatural amino acids such as α,α-disubstituted amino acids, N-alkyl amino acids, lactic
acid, and other unconventional amino acids may also be suitable components for
compounds of the present invention. Examples of unconventional amino acids include:
β-alanine, 1-naphthylalanine, 2-naphthylalanine, 3-pyridylalanine, 4-hydroxyproline,
O-phosphoserine, N-acetylserine, N-formylmethionine, 3-methylhistidine, 
5-hydroxylysine, nor-leucine, and other similar amino acids and imino acids (e.g., 
4-hydroxyproline). 

"Peptide" or "polypeptide" refers to a polymer in which the monomers are alpha 
amino acids joined together through amide bonds. Peptides are two or often more amino
acid monomers long. One or more of the peptide chains disclosed herein may appear in
the compounds of the present invention. It is also contemplated that the peptide chains disclosed
herein represent only a portion of the overall peptide included in the compound. 

The term "dimer" as in a peptide "dimer" refers to a compound in which two 
peptide chains are linked; generally, although not necessarily, the two peptide chains will
be identical and are linked through a linking moiety covalently bound to the carboxyl 
terminus of each chain. 

The term "agonist" is used herein to refer to a ligand that binds to a receptor and 
activates the receptor. 

The term "antagonist" is used herein to refer to a ligand that binds to a receptor
without activating the receptor. Antagonists are either competitive antagonists or
noncompetitive antagonists. A "competitive antagonist" blocks the receptor site that is
specific for the agonist. A "noncompetitive antagonist" inactivates or otherwise affects the 
functioning of the receptor by interacting with a site other than the agonist binding site. 

The term "modulator" as in a "G-CSFR-modulator" refers to a compound that is 
either an agonist or antagonist of the G-CSFR.

"Pharmaceutically or therapeutically effective dose or amount" refers to a dosage
level sufficient to induce a desired biological result. That result can be alleviation of the
signs, symptoms, or causes of a disease, or any other desired alteration of a biological
system. Preferably, this dose or amount will be sufficient to either at least partially
activate or at least partially inactivate G-CSFR and, thus, alleviate the symptoms
associated with an undesired neutrophil count in vivo. 

An "optimal neutrophil count" refers to a quantity of neutrophils in a patient that 
is determined by a clinician to be optimal for that patient in light of the patient's disease 
state, condition, etc. 

An "undesired neutrophil count" refers to a quantity of neutrophils in a patient
that is determined by a clinician to be not optimal for that patient in light of the patient's
disease state, condition, etc. Thus, an undesired neutrophil count may be depressed,
elevated or even equal to the expected neutrophil count so long as the clinician determines
that the actual count is not optimal for the patient. The compounds of the present
invention are intended to, inter alia, provide the clinician with compounds that, when
administered to a patient, bring that patient's neutrophil count closer to an optimal 
neutrophil count. 

The term "treat" as in "treat a disease" is intended to include any means of 
treating a disease in a mammal, including (1) preventing the disease, i.e., avoiding any 
clinical symptoms of the disease, (2) inhibiting the disease, i.e., arresting the
development or progression of clinical symptoms, and/or (3) relieving the disease, i.e.,
causing regression of clinical symptoms. 

"Optional" or "optionally" means that the subsequently described circumstance 
may or may not occur, so that the description includes instances where the circumstance 
occurs and instances where it does not.

By "pharmaceutically acceptable carrier" is meant a material which is not
biologically or otherwise undesirable, i.e., the material may be administered to an
individual along with the selected active agent without causing any undesirable biological
effects or interacting in a deleterious manner with any of the other components of the
pharmaceutical composition in which it is contained.

II. The Novel Compounds 

A. Compounds of Formula (I): 
In a first embodiment, the invention provides compounds comprising a peptide
chain approximately 10 to 40 amino acids in length that binds to G-CSFR and contains a
sequence of amino acids of formula (I)

(I) C X_1 X_2 X_3 X_4 X_5 X_6 X_7 X_8 C (SEQ ID NO: 1)
wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X_1
is A, N, S, F, D, G, L, T, E, V, P, Q, H, M or K; X_2 is M, G, R, H, D, I, V, A, S, E, N, F,
Y, P, C, W or T; X_3 is E, V, W, F, M, A, N, S, L, T, Y, G or P; X_4 is V, I, G, Q, W, M, T,
Y, L, P, D, C, E or A; X_5 is M, E, W, L, P, N, I, T, V, F, Y, Q, S, R, W, G, H or D; X_6 is
H, A, W, Y, V, F, Q, M, N, E, S, D, P or G; X_7 is M, F, Y, V, N, L, H, D, S, W, G, Q, C or
T; and X_8 is C, Y, R, I, K, W, L, E, M, H, A, T, F, D, P, G or Q.

Preferably X_1 is D or P; X_2 is D or P; X_3 is E or W; X_4 is V, I or Y; X_5 is M or L;
X_6 is W, Y or F; X_7 is M, Y or D; and X_8 is C or M.

Examples of particularly preferred sequences satisfying formula (I) include, but are not 
limited to, the following: 
CAGEVMHMCC (SEQ ID NO: 8);
CNREIEAMCC (SEQ ID NO: 9);
CADEVMHFCC (SEQ ID NO: 10);
CNREIMWMCC (SEQ ID NO: 11);
CSHEVWWYCC (SEQ ID NO: 12);
CSREVLYYCC (SEQ ID NO: 13);
CFIEGPWVCC (SEQ ID NO: 14);
CFVEGNWYCC (SEQ ID NO: 15);
CAAEVMVNCC (SEQ ID NO: 16);
CSDEVIFYCC (SEQ ID NO: 17);
CDREIMWFCC (SEQ ID NO: 18); 
CAHEVMWMCC (SEQ ID NO: 19); 
CGSEVTFMCC (SEQ ID NO: 20); 
CLEEIMWLCC (SEQ ID NO: 21); 
CAREVLAMCC (SEQ ID NO: 22);
CSVEVMQMCC (SEQ ID NO: 23); 
CTNVQLMHYC (SEQ ID NO: 24); 
CDVWQLFDRC (SEQ ID NO: 25); 
CSFVQLNSIC (SEQ ID NO: 26); 
CDYWQWFDKC (SEQ ID NO: 27); 
CESFWVELWC (SEQ ID NO: 28); 
CVPWMFYDLC (SEQ ID NO: 29); 
CDPWMFYDLC (SEQ ID NO: 30); 
CDPWVLFDEC (SEQ ID NO: 31); 
CDHWTYFDMC (SEQ ID NO: 32); 
CVVWTLYDKC (SEQ ID NO: 33); 
CPDWYQSYMC (SEQ ID NO: 34); 
CPDWYSYYMC (SEQ ID NO: 35); 
CPEWYTDVMC (SEQ ID NO: 36); 
CPDWYLDYMC (SEQ ID NO: 37); 
CPEWYLDYMC (SEQ ID NO: 38); 
CPDWYLPYMC (SEQ ID NO: 39); 
CPEWYLPYMC (SEQ ID NO: 40); 
CQDWWVELWC (SEQ ID NO: 41); 
CPDWYLPWMC (SEQ ID NO: 42); 
CACMLRWHC (SEQ ID NO: 43); 
CQRAGYMLAC (SEQ ID NO: 44); 
CHANPVWGEC (SEQ ID NO: 45); 
CFWSDWGQTC (SEQ ID NO: 46);
CPHWTSYYMC (SEQ ID NO: 47);
CETLCGACFC (SEQ ID NO: 48);
CATTINDTLC (SEQ ID NO: 49);
CLNYPHPVFC (SEQ ID NO: 50);
CMDGEMAVDC (SEQ ID NO: 51);
CNMGWMSWPC (SEQ ID NO: 52);
CETYADWLGC (SEQ ID NO: 53); 
CDPWMFFDMC (SEQ ID NO: 54); 
CDPWIWYDLC (SEQ ID NO: 55); 
CDPWIMYDRC (SEQ ID NO: 56); 
CDPWVFFDIC (SEQ ID NO: 57); 
CDPWTYYDLC (SEQ ID NO: 58); 
CDPWIFYDRC (SEQ ID NO: 59); 
CDPWLFYDLC (SEQ ID NO: 60); 
CDPWVWYDLC (SEQ ID NO: 61); 
CDPWIFFDRC (SEQ ID NO: 62); 
CDPWMFFDQC (SEQ ID NO: 63); 
CDPWLWYDRC (SEQ ID NO: 64);
CDVWVWYDQC (SEQ ID NO: 65);
CDPWIYYDLC (SEQ ID NO: 66);
CVPWTLFDLC (SEQ ID NO: 67);
CPAWYLEYMC (SEQ ID NO: 68); 
CPDWYLEYMC (SEQ ID NO: 69); 
CKYWQWFDKC (SEQ ID NO: 70); and 
CDHWMWYDKC (SEQ ID NO: 71). 

Other preferred formula (I) sequences include, but are not limited to, the following:
GCNREIEAMCCG (SEQ ID NO: 72);
GCPEWYTDVMCG (SEQ ID NO: 73); 
NWYCMDGEMAVDCEAT (SEQ ID NO: 74); 
WQSCNMGWMSWPCYFV (SEQ ID NO: 75);
HELCETYADWLGCVEW (SEQ ID NO: 76);
PCDPWMFFDMCERW (SEQ ID NO: 77); 
LRGCDPWIWYDLCPAV (SEQ ID NO: 78); 
GYLCDPWYDRCLGF (SEQ ID NO: 79); 
RFACDPWVFFDICGYW (SEQ ID NO: 80); 
GYWCDPWTYYDLCLTA (SEQ ID NO: 81); 
MWTCDPWIFYDRCFLN (SEQ ID NO: 82); 
GSSCDPWLFYDLCLLD (SEQ ID NO: 83); 
GGGCDPWVWYDLCWCD (SEQ ID NO: 84); 
YTSCDPWIFFDRCMSV (SEQ ID NO: 85); 
DPYCDPWMFFDQCAYL (SEQ ID NO: 86); 
REFCDPWLWYDRCL (SEQ ID NO: 87);
NTGCDVWVWYDQCFAM (SEQ ID NO: 88); 
LVFCDPWIYYDLCMDT (SEQ ID NO: 89); 
GCSFVQLNSICG (SEQ ID NO: 90); 
GCPAWYLEYMCG (SEQ ID NO: 91); 
GCPDWYLEYMCG (SEQ ID NO: 92); 
GCKYWQWFDKCG (SEQ ID NO: 93); and 
GCDHWMWYDKCG (SEQ ID NO: 94). 
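
As a non-limiting illustration, membership in formula (I) can be tested mechanically. The Python sketch below is offered under stated assumptions: the per-position residue sets are transcribed from the formula (I) definition above, the "approximately 10 to 40" length range is treated as a hard bound for simplicity, and the names FORMULA_I_POSITIONS and contains_formula_i are hypothetical.

```python
import re

# Residues allowed at X1..X8 of formula (I), transcribed from the definition above.
FORMULA_I_POSITIONS = [
    "ANSFDGLTEVPQHMK",     # X1
    "MGRHDIVASENFYPCWT",   # X2
    "EVWFMANSLTYGP",       # X3
    "VIGQWMTYLPDCEA",      # X4
    "MEWLPNITVFYQSRGHD",   # X5 (W is listed twice in the text; once suffices here)
    "HAWYVFQMNESDPG",      # X6
    "MFYVNLHDSWGQCT",      # X7
    "CYRIKWLEMHATFDPGQ",   # X8
]

# A cysteine, the eight constrained positions, then a closing cysteine.
FORMULA_I_RE = re.compile(
    "C" + "".join(f"[{s}]" for s in FORMULA_I_POSITIONS) + "C"
)


def contains_formula_i(sequence, min_len=10, max_len=40):
    """Return True if the chain length is in range and the motif occurs in it."""
    if not (min_len <= len(sequence) <= max_len):
        return False
    return FORMULA_I_RE.search(sequence) is not None


print(contains_formula_i("CAGEVMHMCC"))    # SEQ ID NO: 8
print(contains_formula_i("GCPEWYTDVMCG"))  # SEQ ID NO: 73
print(contains_formula_i("CWGEVMHMCC"))    # hypothetical: W is not allowed at X1
```

A regular expression is used here only because each position of the formula is an independent residue set; binding to G-CSFR itself cannot be inferred from a pattern match.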

B. Compounds of Formula (II): 

In another aspect, compounds are provided comprising a peptide chain 
approximately 9 to 40 amino acids in length that binds to G-CSFR and contains a sequence 
of amino acids of formula (II) 

(II) X^I_1 X^I_2 X^I_3 SGWVW X^I_4 (SEQ ID NO: 2)
wherein each amino acid is indicated by the standard one-letter abbreviation, and wherein
X^I_1 is S, Q, R, L or Y; X^I_2 is N, S, T, A or D; X^I_3 is E, D or N; and X^I_4 is L, V, T, P or H.

Preferably X^I_1 is S or Q; X^I_2 is S; X^I_3 is N; and X^I_4 is V.

Examples of particularly preferred sequences satisfying formula (II) include, but 
are not limited to, the following: 

SNESGWVWL (SEQ ID NO: 95);
QSNSGWVWV (SEQ ID NO: 96);
RTESGWVWT (SEQ ID NO: 97); 
RANSGWVWV (SEQ ID NO: 98); 
YDNSGWVWH (SEQ ID NO: 99); and 
LSDSGWVWVP (SEQ ID NO: 100). 



Other preferred formula (II) sequences include, but are not limited to, the following: 
EQSNSGWVWVGGGGC (SEQ ID NO: 101); 
CEQSNSGWVWV (SEQ ID NO: 102); 
EQSNSGWVWVGGGGCKKK (SEQ ID NO: 103); 
EQSNSGWVWVGKKKC (SEQ ID NO: 104); 
EQSNSGWVWVGKKK (SEQ ID NO: 105); 
KKKEQSNSGWVWV (SEQ ID NO: 106);
EQSNSGWVWVGKKKSKKK (SEQ ID NO: 107); 
EQSNSGWVWVGGCKKK (SEQ ID NO: 108); 
EQSNSGWVWVGGGGGGCKKK (SEQ ID NO: 109); 
SNESGWVWLP (SEQ ID NO: 110); 
EQSNSGWVWV (SEQ ID NO: 111);
SRTESGWVWT (SEQ ID NO: 112);
QRANSGWVWV (SEQ ID NO: 113);
DYDNSGWVWH (SEQ ID NO: 114);
EQSNSGWVWVGKKKK (SEQ ID NO: 115);
EQSNSGWVWVGGGGSKKK (SEQ ID NO: 116);
EQSNSGWVWVGGGGS (SEQ ID NO: 117);
EQSNSGWVWVGGGGSEQSNSGWVWVGGGGS (SEQ ID NO: 118);
RYQSFELSDSGWVWVPVARH (SEQ ID NO: 119); and 
EQSNSGWVWVGGGGCKKKC (SEQ ID NO: 492). 
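
For illustration only, the position of the formula (II) motif within a longer chain can be located programmatically. The following Python sketch assumes the residue sets of formula (II) read as S/Q/R/L/Y, N/S/T/A/D, E/D/N and L/V/T/P/H (the comma between L and V in the last set is an assumption); the names FORMULA_II_RE and find_formula_ii are hypothetical.

```python
import re

# X1, X2, X3, the fixed core SGWVW, then X4 (sets assumed as stated in the lead-in).
FORMULA_II_RE = re.compile(r"[SQRLY][NSTAD][EDN]SGWVW[LVTPH]")


def find_formula_ii(sequence):
    """Return (start_index, matched_text) for each non-overlapping motif occurrence."""
    return [(m.start(), m.group()) for m in FORMULA_II_RE.finditer(sequence)]


# SEQ ID NO: 118 is a tandem repeat, so the motif is expected to occur twice.
hits = find_formula_ii("EQSNSGWVWVGGGGSEQSNSGWVWVGGGGS")
print(hits)  # [(1, 'QSNSGWVWV'), (16, 'QSNSGWVWV')]
```

Indices are zero-based. Reporting every occurrence, rather than a single yes/no answer, makes repeated motifs such as those in SEQ ID NO: 118 visible.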

C. Compounds of Formula (III):

In another aspect, the invention provides compounds comprising a peptide
chain approximately 6 to 40 amino acids in length that binds to G-CSFR and contains a
sequence of amino acids of formula (III)

(III) ER X^II_1 X^II_2 X^II_3 C (SEQ ID NO: 3)
wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^II_1
is D, L, S, G, E, A, K or Y; X^II_2 is W, Y, F, L or V; and X^II_3 is F, G, M or L.

Preferably, X^II_1 is D or L; X^II_2 is W; and X^II_3 is F.

Examples of particularly preferred sequences satisfying formula (III) include, but
are not limited to, the following:

ERDWFC (SEQ ID NO: 120); 
ERDWGC (SEQ ID NO: 121); 
ERLWFC (SEQ ID NO: 122); 
ERSYFC (SEQ ID NO: 123);
ERGWFC (SEQ ID NO: 124);
EREWFC (SEQ ID NO: 125);
ERAWFC (SEQ ID NO: 126);
ERLYFC (SEQ ID NO: 127); 
ERYFMC (SEQ ID NO: 128); 
ERLFLC (SEQ ID NO: 129); 
ERALMC (SEQ ID NO: 130); 
ERDVMC (SEQ ID NO: 131); and 
ERKWFC (SEQ ID NO: 132). 

Particularly preferred compounds are of the formula:
ETWGERDWFC (SEQ ID NO: 133);
ETWGERDWGC (SEQ ID NO: 134);
STAERLWFCG (SEQ ID NO: 135);
YETAERSYFC (SEQ ID NO: 136);
ADNAERGWFC (SEQ ID NO: 137);
QSNSEREWFC (SEQ ID NO: 138);
STSERAWFCG (SEQ ID NO: 139);
ASWSERGWFC (SEQ ID NO: 140);
ELSSEREWFC (SEQ ID NO: 141);
DMQGERGWFC (SEQ ID NO: 142);
SSSERAWFCG (SEQ ID NO: 143);
GNMRERLYFC (SEQ ID NO: 144);
QPNRERYFMC (SEQ ID NO: 145);
SVTRERLFLC (SEQ ID NO: 146);
IPLSERALMCSSWNC (SEQ ID NO: 147);
WARSERDVMCLSYVC (SEQ ID NO: 148);
QSNSEREWFCG (SEQ ID NO: 149);
QSNSEREWFCGGGGS (SEQ ID NO: 150);
NLEEALAQERLWFCRSGNC (SEQ ID NO: 151); and
NLESYEMEERKWFCKMFSC (SEQ ID NO: 152).

D. Compounds of Formula (IV): 

In another aspect, compounds are provided comprising a peptide chain 
approximately 9 to 40 amino acids in length that binds to G-CSFR and contains a sequence 
of amino acids of formula (IV):

(IV) X^III_1 MVY X^III_2 X^III_3 P X^III_4 W (SEQ ID NO: 4)
wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^III_1
is D or E; X^III_2 is A or T; X^III_3 is Y or V; and X^III_4 is P or Y.

Examples of particularly preferred sequences satisfying formula (IV) include,
but are not limited to, the following:

DMVYAYPPW (SEQ ID NO: 153); and 
EMVYTVPYW (SEQ ID NO: 154).

Other preferred formula (IV) sequences include, but are not limited to, the following:
DMVYAYPPWS (SEQ ID NO: 155); and
DEMVYTVPYW (SEQ ID NO: 156).

E. Compounds of Formula (V):

In another aspect, compounds are provided comprising a peptide chain 
approximately 12 to 40 amino acids in length that binds to G-CSFR and contains a 
sequence of amino acids of formula (V): 

(V) C X^IV_1 X^IV_2 X^IV_3 X^IV_4 X^IV_5 X^IV_6 X^IV_7 X^IV_8 X^IV_9 X^IV_10 C (SEQ ID NO: 5)
wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^IV_1
is E, G, P, N, R, T, W, S, L, H, A, Q or Y; X^IV_2 is S, T, E, A, D, G, W, P, L, N, V, Y, R or
M; X^IV_3 is R, Y, V, Q, E, T, L, P, S, K, M, A or W; X^IV_4 is L, M, G, F, W, R, S, V, P, A, D,
C or T; X^IV_5 is V, T, A, R, S, L, W, C, I, E, P, H, F, D or Q; X^IV_6 is E, Y, G, T, Q, M, S, N,
A or P; X^IV_7 is C, V, D, G, L, W, E, V, I, S, M or A; X^IV_8 is S, Y, A, W, P, V, L, Q, G, K,
F, I, E or D; X^IV_9 is R, W, M, D, H, V, G, A, Q, L, S, E or Y; and X^IV_10 is M, L, I, S, V, P, W,
F, T, Y, R, or Q.

Preferably X^IV_1 is E; X^IV_2 is S or A; X^IV_3 is R; X^IV_4 is L; X^IV_5 is V or S; X^IV_6 is
E; X^IV_7 is C; X^IV_8 is S; X^IV_9 is R; and X^IV_10 is L.

Examples of particularly preferred sequences satisfying formula (V) include, but 
are not limited to, the following: 

CESRLVECSRMC (SEQ ID NO: 157);
CETYMTYVYWLC (SEQ ID NO: 158);
CGERLAECARLC (SEQ ID NO: 159);
CESRLRECSMLC (SEQ ID NO: 160);
CEARLSECSRIC (SEQ ID NO: 161);
CPARLLECSRMC (SEQ ID NO: 162);
CESVGVGDWWSC (SEQ ID NO: 163);
CEDRLVEGPWVC (SEQ ID NO: 164);
CNDQFRTCVDVC (SEQ ID NO: 165);
CRGEWWELYHPC (SEQ ID NO: 166);
CEDTRTGWAWSC (SEQ ID NO: 167);
CTWLSSGELVWC (SEQ ID NO: 168);
CWPPVCEVSGIC (SEQ ID NO: 169);
CSLSPIQLQHLC (SEQ ID NO: 170);
CLARLEECSRFC (SEQ ID NO: 171);
CHNSSPMVGVTC (SEQ ID NO: 172);
CHVSPVQIKALC (SEQ ID NO: 173); 
CAAPATSWFQYC (SEQ ID NO: 174); 
CASKLHECSLRC (SEQ ID NO: 175); 
CEPMDSNGIVQC (SEQ ID NO: 176); 
CQYASAADEQRC (SEQ ID NO: 177); 
CEYWDEPSLSWC (SEQ ID NO: 178); 
CERECFQMLERC (SEQ ID NO: 179); 
CGMSTDELDEIC (SEQ ID NO: 180);
CYVSPSTGLYSC (SEQ ID NO: 181);
CEARLVECSRLC (SEQ ID NO: 182);
CESRLSECSRMC (SEQ ID NO: 183); 
CELKLQECARRC (SEQ ID NO: 184); 
CELKLQEAARRC (SEQ ID NO: 185); and 
CLERLEECSRFC (SEQ ID NO: 186). 

Other preferred formula (V) sequences include, but are not limited to, the following:
GGCESRLVECSRMC (SEQ ID NO: 187);
GGCETYMTYVYWLC (SEQ ID NO: 188);
EWLCESVGVGDWWSC (SEQ ID NO: 189);
YHPCEDRLVEGPWVCCRS (SEQ ID NO: 190); 
WLLCNDQFRTCVDVCDNV (SEQ ID NO: 191); 
IAECRGEWWELYHPCLAA (SEQ ID NO: 192); 
TWYCEDTRTGWAWSCLEL (SEQ ID NO: 193); 
QLDCTWLSSGELVWCSDW (SEQ ID NO: 194); 
QFDCTWLSSGELVWCSDW (SEQ ID NO: 195); 
CWPPVCEVSGICS (SEQ ID NO: 196); 
CGCSLSPIQLQHLC (SEQ ID NO: 197); 
CGCHVSPVQIKALC (SEQ ID NO: 198); 
GCHVSPVQIKALC (SEQ ID NO: 199); 
GTSCAAPATSWFQYCVLP (SEQ ID NO: 200); 



WO 02/07676 



PCT/US01/23046 



RMDCASKLHECSLRCAYA (SEQ ID NO: 201); 
GWCEPMDSNGIVQCSMR (SEQ ID NO: 202); 
IDVCQYASAADEQRCLRI (SEQ ID NO: 203); 
NVLCEYWDEPSLSWCLSS (SEQ ID NO: 204); 
CQCERECFQMLERC (SEQ ID NO: 205);
FCSCGMSTDELDEICAIW (SEQ ID NO: 206);
EEVCYVSPSTGLYSCYDQ (SEQ ID NO: 207);
LLDICELKLQECARRCN (SEQ ID NO: 208);
GGGLLDICELKLQECARRCN (SEQ ID NO: 209);
GRTGGGLLDICELKLQECARRCN (SEQ ID NO: 210);
LGIEGRTGGGLLDICELKLQECARRCN (SEQ ID NO: 211);
LLDICELKLQEAARRCN (SEQ ID NO: 212); and
KLLDICELKLQEAARRCN (SEQ ID NO: 213).

Particularly preferred formula (V) sequences are selected from the group consisting of:
LLDICELKLQECARRCN (SEQ ID NO: 208);
GGGLLDICELKLQECARRCN (SEQ ID NO: 209);
GRTGGGLLDICELKLQECARRCN (SEQ ID NO: 210);
LGIEGRTGGGLLDICELKLQECARRCN (SEQ ID NO: 211);
LLDICELKLQEAARRCN (SEQ ID NO: 212); and
KLLDICELKLQEAARRCN (SEQ ID NO: 213).

F. Compounds of Formula (VI): 

In another aspect, compounds are provided comprising a peptide chain 
approximately 9 to 40 amino acids in length that binds to G-CSFR and contains a sequence
of amino acids of formula (VI):

(VI) X^V_1 X^V_2 X^V_3 X^V_4 X^V_5 X^V_6 C X^V_7 X^V_8 (SEQ ID NO: 6)
wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^V_1
is E, C, Q, V, or Y; X^V_2 is E, A, L, M, S, W, or Q; X^V_3 is K, R or T; X^V_4 is L, A, or V; X^V_5
is R, A, M, H, E, V, L, G, D, Q, or S; X^V_6 is E or V; X^V_7 is A or G; and X^V_8 is R, H, G or L.

Preferably X^V_1 is E; X^V_2 is A or L; X^V_3 is K or R; X^V_4 is L; X^V_6 is E; X^V_7 is A;
and X^V_8 is R.

Examples of particularly preferred sequences satisfying formula (VI) include, 
but are not limited to, the following: 

EEKLRECAR (SEQ ID NO: 214); 
EARLAECAR (SEQ ID NO: 215); 
CMKLMECAR (SEQ ID NO: 216); 
ELRLRECAH (SEQ ID NO: 217); 
EAKLHECAR (SEQ ID NO: 218); 
ELKLAECAR (SEQ ID NO: 219); 
EARLEECAR (SEQ ID NO: 220); 
EAKLRECAR (SEQ ID NO: 221); 
ELRLAECAR (SEQ ID NO: 222); 
ESRLAECAR (SEQ ID NO: 223); 
EAKLVECAR (SEQ ID NO: 224); 
ESRLRECAR (SEQ ID NO: 225); 
EAKLAECAR (SEQ ID NO: 226); 
QWRLEECAR (SEQ ID NO: 227); 
QLRLEECAR (SEQ ID NO: 228); 
ELRLEECAR (SEQ ID NO: 229); 
EAKLLECAR (SEQ ID NO: 230); 
EARAGVCAG (SEQ ID NO: 231);
EAKAGVCAG (SEQ ID NO: 232); 
VARLEECAR (SEQ ID NO: 233); 
ELKLDECAR (SEQ ID NO: 234); 
EWRLQECAR (SEQ ID NO: 235); 
EAKLSECAR (SEQ ID NO: 236); 
EARLSECAR (SEQ ID NO: 237); 
ELKLLECAR (SEQ ID NO: 238); 
ELRLQECGR (SEQ ID NO: 239); 
EQKLAECAR (SEQ ID NO: 240);
ELRLQECAR (SEQ ID NO: 241);
ELKLEECAR (SEQ ID NO: 242); 
ESRLEECAR (SEQ ID NO: 243); 
EATVQECAR (SEQ ID NO: 244); 
ELKLQECAR (SEQ ID NO: 245); 
YSRLEECGR (SEQ ID NO: 246); 
ELRLRECAL (SEQ ID NO: 247); 
EARLLECAR (SEQ ID NO: 248);
ESRLLECAR (SEQ ID NO: 249); 
VLKLEECAR (SEQ ID NO: 250); 
ESKLAECAR (SEQ ID NO: 251); 
ESKLRECAR (SEQ ID NO: 252); 
EYKLGECAR (SEQ ID NO: 253); 
ESRLQECAR (SEQ ID NO: 254); 
QARLAECAR (SEQ ID NO: 255); 
ELKKQECAR (SEQ ID NO: 256); 
ESRLSECAR (SEQ ID NO: 257); 
EARLEECGR (SEQ ID NO: 258); 
ESRLAECGR (SEQ ID NO: 259);
EWRLEECAR (SEQ ID NO: 260); 
EARLSECGR (SEQ ID NO: 261); 
AARLAECAR (SEQ ID NO: 262); 
EWKLAECAR (SEQ ID NO: 263); 
ESKLEECAR (SEQ ID NO: 264); 
DVKLAECAR (SEQ ID NO: 265); 
ELQLEECAR (SEQ ID NO: 266); and 
EYKLASCAR (SEQ ID NO: 267). 



Other preferred formula (VI) sequences include, but are not limited to, the following:
RLSICEEKLRECARGC (SEQ ID NO: 268);
PLTTCEARLAECARQL (SEQ ID NO: 269);
LALCMKLMECARRY (SEQ ID NO: 270);
ELVMCELRLRECAHRA (SEQ ID NO: 271);
PLARCEAKLHECARQL (SEQ ID NO: 272); 
LLSVCELKLAECARSK (SEQ ID NO: 273); 
RLEWCEARLEECARRC (SEQ ID NO: 274); 
RLRWEAKLRECARGR (SEQ ID NO: 275); 
CVAHLELRLAECARQI (SEQ ID NO: 276); 
HLARCESRLAECARQL (SEQ ID NO: 277); 
RLALLEAKLVECARRL (SEQ ID NO: 278); 
DLFSLESRLRECARRV (SEQ ID NO: 279); 
AVPVLEAKLAECARRF (SEQ ID NO: 280); 
YLQQLQWRLEECARGM (SEQ ID NO: 281); 
YLELCQLRLEECARQFN (SEQ ID NO: 282); 
ELHICELRLEECARGR (SEQ ID NO: 283); 
RVARCELRLAECARKS (SEQ ID NO: 284); 
YLEVLESRLAECARWK (SEQ ID NO: 285); 
EAKLLECARAR (SEQ ID NO: 286); 
ELSLCEARAGVCAGSVTK (SEQ ID NO: 287); 
ELSLCEAKAGVCAGSVTK (SEQ ID NO: 288); 
ALWQCVARLEECARSR (SEQ ID NO: 289); 
CLKSCELKLDECARRM (SEQ ID NO: 290); 
ALQTCEWRLQECARSR (SEQ ID NO: 291); 
YISQCEAKLAECARLY (SEQ ID NO: 292); 
ELSSCEAKLSECARRW (SEQ ID NO: 293);
ELSSCEARLSECARRW (SEQ ID NO: 294);
QLLQCELKLLECARQG (SEQ ID NO: 295); 
ELLRCEARLAECARGC (SEQ ID NO: 296); 
QLRQCELRLQECGRHGN (SEQ ID NO: 297); 
PLTSCEQKLAECARRF (SEQ ID NO: 298); 
LLGMCELRLQECARAK (SEQ ID NO: 299); 
ELSRCELKLEECARGM (SEQ ID NO: 300); 
DCRPCESRLEECARRL (SEQ ID NO: 301); 
RLSVCEARLEECARQL (SEQ ID NO: 302); 



PLKMCEATVQECARLI (SEQ ID NO: 303); 
LLLFCEARLSECARHV (SEQ ID NO: 304); 
SLSMCEARLAECARLL (SEQ ID NO: 305); 
PLFSCELKLQECARRCN (SEQ ID NO: 306); 
SLERCYSRLEECGRPJ (SEQ ID NO: 307); 
PLTSCELRLRECALRSN (SEQ ID NO: 308); 
KLAACELKLAECARRW (SEQ ID NO: 309); 
KLAACELRLAECARRW (SEQ ID NO: 310); 
ALTRCELRLAECARKI (SEQ ID NO: 311); 
LLQQCELKLAECARSI (SEQ ID NO: 312); 
QLWQCEARLLECARRS (SEQ ID NO: 313); 
RLRLCESRLLECARSL (SEQ ID NO: 314); 
QLETCVLKLEECARRCN (SEQ ID NO: 315); 
ALSQCELRLAECARSVTK (SEQ ID NO: 316); 
ELKLAECARRS (SEQ ID NO: 317); 
ALSRCESKLAECARRQ (SEQ ID NO: 318); 
LMSTCESKLRECARSL (SEQ ID NO: 319); 
SLQRCEYKLGECARSL (SEQ ID NO: 320); 
RLELLESRLQECARQLN (SEQ ID NO: 321); 
QMEWCQARLAECARCCN (SEQ ID NO: 322); 
PLFSCELKKQECARRCN (SEQ ID NO: 323); 
LLDKCESRLSECARRL (SEQ ID NO: 324); 
LLARCEARLEECGRQC (SEQ ID NO: 325); 
DLLYCESRLAECGRM (SEQ ID NO: 326); 
ALQMCEWRLEECARRL (SEQ ID NO: 327); 
LLTMCEARLSECGRRL (SEQ ID NO: 328); 
ALWRCESRLAECARRS (SEQ ID NO: 329); 
LLATCAARLAECARQL (SEQ ID NO: 330);
LQTCEWKLAECARSN (SEQ ID NO: 331); 
PLRSCESKLEECARQL (SEQ ID NO: 332); 
CLRALDVKLAECARHL (SEQ ID NO: 333); 
RLKTLELQLEECARRS (SEQ ID NO: 334);
KLRDVELKLAECARRS (SEQ ID NO: 335);
SLQRCEYKLASCARSL (SEQ ID NO: 336); 
RLARCELRLAECARKS (SEQ ID NO: 337); 
DLWYLESKLEECARRCN (SEQ ID NO: 338); 
DLWYLESKLEECARRANG (SEQ ID NO: 339); 
DLWYLESKLEECARRCNG (SEQ ID NO: 340); 
KQRELELKLAECARRS (SEQ ID NO: 341); 
QMQEWCARLAECARCCN (SEQ ID NO: 342); and 
LLDICELKLQECARRAN (SEQ ID NO: 343). 

A particularly preferred sequence of formula (VI) is: 

LLDICELKLQECARRAN (SEQ ID NO: 343). 

G. Compounds of Formula (VII):

In another aspect, the invention provides compounds comprising a peptide chain
approximately 10 to 40 amino acids in length that binds to G-CSFR and contains a
sequence of amino acids of formula (VII):

(VII) X^{VI}_1 X^{VI}_2 X^{VI}_3 X^{VI}_4 X^{VI}_5 E X^{VI}_6 X^{VI}_7 X^{VI}_8 X^{VI}_9 (SEQ ID NO: 7)

wherein each amino acid is indicated by standard one-letter abbreviation, and wherein
X^{VI}_1 is A, E or G; X^{VI}_2 is E, H or D; X^{VI}_3 is R or G; X^{VI}_4 is K, Y, M, N, Q, R, D, I, S or E;
X^{VI}_5 is A, S or P; X^{VI}_6 is E, D, T, Q, K or A; X^{VI}_7 is R, W, K, L, S, A or Q;
X^{VI}_8 is R or E; and X^{VI}_9 is W, G, or R.

Preferably X^{VI}_1 is A; X^{VI}_2 is E; X^{VI}_3 is R; X^{VI}_5 is A; X^{VI}_6 is E; X^{VI}_7 is R;
X^{VI}_8 is R and X^{VI}_9 is W.
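
As an illustration only, the position-by-position definition of formula (VII) can be expressed as a regular expression and used to screen candidate sequences. The sketch below assumes the residue sets exactly as listed above; the names `FORMULA_VII_POSITIONS` and `contains_formula_vii` are hypothetical and are not part of the specification.

```python
import re

# Allowed residues at each position of formula (VII), transcribed from the
# definitions in the text. The sixth position is the fixed residue E.
# (These names are illustrative, not taken from the specification.)
FORMULA_VII_POSITIONS = [
    "AEG",         # X1
    "EHD",         # X2
    "RG",          # X3
    "KYMNQRDISE",  # X4
    "ASP",         # X5
    "E",           # fixed E
    "EDTQKA",      # X6
    "RWKLSAQ",     # X7
    "RE",          # X8
    "WGR",         # X9
]

# One character class per position, e.g. "[AEG][EHD][RG]...".
FORMULA_VII_PATTERN = re.compile(
    "".join(f"[{residues}]" for residues in FORMULA_VII_POSITIONS)
)


def contains_formula_vii(sequence: str) -> bool:
    """Return True if any 10-residue window of the sequence fits formula (VII)."""
    cleaned = sequence.replace(" ", "").upper()
    return FORMULA_VII_PATTERN.search(cleaned) is not None
```

For example, `contains_formula_vii("MLAERKAEERRWFNTHGRE")` finds the core AERKAEERRW inside the longer peptide, whereas a formula (VI) sequence such as LLDICELKLQECARRAN has no matching window.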

Examples of particularly preferred sequences satisfying formula (VII) include, 
but are not limited to, the following: 

AERKAEERRW (SEQ ID NO: 344); 
AERYAEEREG (SEQ ID NO: 345); 
AERMAEERRW (SEQ ID NO: 346); 
AERKAEERRR (SEQ ID NO: 347); 
AHRNAEERRW (SEQ ID NO: 348); 
AERKSEDWRW (SEQ ID NO: 349);
AERKAEEKRR (SEQ ID NO: 350);
AERQAETRRW (SEQ ID NO: 351);
AERNAEERRW (SEQ ID NO: 352); 
AERQAEERRW (SEQ ID NO: 353); 
AERRAEERRW (SEQ ID NO: 354); 
AERDAEQRRW (SEQ ID NO: 355); 
AERIAEERRW (SEQ ID NO: 356); 
AERSAEERRW (SEQ ID NO: 357); 
AERKAEELRW (SEQ ID NO: 358); 
AERKAEESRW (SEQ ID NO: 359); 
EERKAEERRW (SEQ ID NO: 360);
ADGKAEERRW (SEQ ID NO: 361); 
ADGKAEELRW (SEQ ID NO: 362); 
ADGMPEERRW (SEQ ID NO: 363); 
ADGEAEKRRW (SEQ ID NO: 364); 
ADGNAEERRW (SEQ ID NO: 365); 
ADGEAEKARW (SEQ ID NO: 366); 
AEGEAEKARW (SEQ ID NO: 367); 
GERKAEERRW (SEQ ID NO: 368); 
AEREAEERRW (SEQ ID NO: 369); 
ADGEAEARRW (SEQ ID NO: 370); 
ADGRAEEARW (SEQ ID NO: 371); 
AEGRAEEARW (SEQ ID NO: 372); 
AEREAEKARW (SEQ ID NO: 373); 
AERKAEEQRW (SEQ ID NO: 374); 
AERDAEKRRW (SEQ ID NO: 375); and 
AEREAEKLRW (SEQ ID NO: 376). 

Other preferred formula (VII) sequences include, but are not limited to, the following:
MLAERKAEERRWFNTHGRE (SEQ ID NO: 377); 
MLAERKAEERRWFNTHGREK (SEQ ID NO: 378); 
GGGMLAERKAEERRWFNTHGRE (SEQ ID NO: 379);
CMLAERKAEERRWFNTHGRE (SEQ ID NO: 380);
CMLAERKAEERRWFNTHGREK (SEQ ID NO: 381); 
MLAERYAEEREGFNMQWRE (SEQ ID NO: 382); 
MLAERMAEERRWFRRMG (SEQ ID NO: 383); 
IVAERKAEERRRLNTEGHE (SEQ ID NO: 384); 
ILAHRNAEERRWFQKHGR (SEQ ID NO: 385); 
MLAERKSEDWRWLKTHGRD (SEQ ID NO: 386); 
MLAERKAEEKRRLKTQGRE (SEQ ID NO: 387); 
ILAERQAETRRWMRNAGSVTK (SEQ ID NO: 388); 
MLAERNAEERRWLKRQCG (SEQ ID NO: 389); 
MLAERQAEERRWLKMHGGE (SEQ ID NO: 390); 
MLAERRAEERRWLKTQGGD (SEQ ID NO: 391); 
MLAERQAEERRWLKTQGRD (SEQ ID NO: 392); 
MLAERKAEERRWFKTHGRE (SEQ ID NO: 393); 
MLAERKAEERRWFNNQGRE (SEQ ID NO: 394); 
MPAERDAEQRRWLKTHGRE (SEQ ID NO: 395); 
ILAERIAEERRWLKTQGR (SEQ ID NO: 396); 
MLAERKAEERRWLQTHGRE (SEQ ID NO: 397); 
ILAERSAEERRWLKTQGRE (SEQ ID NO: 398); 
LLAERKAEELRWLKTHGRE (SEQ ID NO: 399); 
MLAERKAEERRWLQTHGRE (SEQ ID NO: 400); 
MLAERNAEERRW (SEQ ID NO: 401); 
MFAERKAEESRWLQSQGRE (SEQ ID NO: 402); 
MLEERKAEERRWLKTHGR (SEQ ID NO: 403); 
MLAERKAEERRWLKMQGRE (SEQ ID NO: 404); 
MLAERNAEERRWFYTHGRE (SEQ ID NO: 405); 
MLADGKAEERRWLKTHGLD (SEQ ID NO: 406); 
MIADGKAEERRWLKTHGRD (SEQ ID NO: 407); 
MLADGKAEELRWLKTQGSD (SEQ ID NO: 408); 
MLAERNAEERRWLKTHGRD (SEQ ID NO: 409); 
MLADGKAEELRWLKTQGRE (SEQ ID NO: 410); 
ILADGKAEERRWLKTHGRD (SEQ ID NO: 411);
MLADGMPEERRWLQTHGRD (SEQ ID NO: 412);
MLADGEAEKRRWLNTHGRD (SEQ ID NO: 413);
MLADGNAEERRWLMTHGRD (SEQ ID NO: 414);
MLADGEAEKARWLKTQGRE (SEQ ID NO: 415);
MLAEGEAEKARWLKTQGRE (SEQ ID NO: 416);
MLADGKAEERRWLKTQGRE (SEQ ID NO: 417); 
MLAERKAEERRWLSAHVRE (SEQ ID NO: 418); 
LLGERKAEERRWYKTHARE (SEQ ID NO: 419); 
MLAERKAEERRWLMTHGHD (SEQ ID NO: 420); 
MLAERKAEERRWLKSQCLE (SEQ ID NO: 421); 
LLAEREAEERRWFKTHGRE (SEQ ID NO: 422); 
MLADGEAEARRWFNMHGRE (SEQ ID NO: 423);
MLADGRAEEARWLKTQGSE (SEQ ID NO: 424);
MLAEGRAEEARWLKTQGSE (SEQ ID NO: 425);
MLAEREAEKARWLKTQGRE (SEQ ID NO: 426);
MMAERKAEEQRWFDIHGRD (SEQ ID NO: 427); 
LTAERDAEKRRWLLTHGGE (SEQ ID NO: 428); 
MLAERQAEERRWLKSQRGE (SEQ ID NO: 429); 
LLAERKAEERRWFATHGRD (SEQ ID NO: 430); 
MLAEREAEKLRWLKSQERA (SEQ ID NO: 431); 
MLAERKAEERRWLKTHGGE (SEQ ID NO: 432); 
KGGGMLAERKAEERRWFNTHGRE (SEQ ID NO: 490); and 
KSTGGLTAERDAEKRRWLLTHGGE (SEQ ID NO: 491). 

H. Other Active Compounds 

In another aspect of the invention, there are provided additional compounds 
comprising a peptide chain approximately 5 to 40 amino acids in length that binds to G- 
CSFR and contains a sequence of amino acids selected from the following compounds: 

CTWTDLESVY (SEQ ID NO: 433);
HTTNEQFFMC (SEQ ID NO: 434);
DTWLELESRY (SEQ ID NO: 435);
HNSSPMVGVT (SEQ ID NO: 436);
DWQKTIPAYW (SEQ ID NO: 437);
RWGREGLVAALL (SEQ ID NO: 438);
WSGTRVWRCWT (SEQ ID NO: 439);
MSLLSYLRS (SEQ ID NO: 440);
LDLLAI (SEQ ID NO: 441);
RIYGVK (SEQ ID NO: 442); 
MWHMFMSLLF (SEQ ID NO: 443); 
FFWASWMHLLW (SEQ ID NO: 444); 
FDDCWREREQFLFQAL (SEQ ID NO: 445); 
CGRASECFRLLEM (SEQ ID NO: 446); 
RECFQMLER (SEQ ID NO: 447); 
CSIRWDFVPGYGLC (SEQ ID NO: 448); 
WMQCWDSLSLCYDM (SEQ ID NO: 449); 
ALLMCESKLAECARAR (SEQ ID NO: 450); 
LAHCKKRKEECAAG (SEQ ID NO: 451); 
SIDGVYLRTSRT (SEQ ID NO: 452); 
SIDGVYLRTRSRTRY (SEQ ID NO: 453); 
VRWLRGSTLRGLRDR (SEQ ID NO: 454); 
DRGGGTVGVYWWESY (SEQ ID NO: 455);
VWGTVGTWLEY (SEQ ID NO: 456); 
LMWVSAY (SEQ ID NO: 457); 
RASDEYGALVRFCTNL (SEQ ID NO: 458); 
NYWCDSNWVCEIA (SEQ ID NO: 459); 
LAHCLLRLEECAAG (SEQ ID NO: 460); 
LALCLARLRECAGG (SEQ ID NO: 461); 
CESRLVECSRM (SEQ ID NO: 462); 
LLDIAELKLQECARRCN (SEQ ID NO: 463);
KLLDIAELKLQECCARRCN (SEQ ID NO: 464);
CSTGGGLTAERDAEKRRWLLTHGGE (SEQ ID NO: 465);
LTAERDAEKRRWLLTHGGEGG (SEQ ID NO: 466);
LTAERDAEKRRWLLTHGGEGGK (SEQ ID NO: 467);
LTAERDAEKRRWLLTHGGEGGGGG (SEQ ID NO: 468);
LTAERDAEKRRWLLTHGGEGGGGGK (SEQ ID NO: 469);
ESGWVW (SEQ ID NO: 470);
NSGWVW (SEQ ID NO: 471);
SGWVW (SEQ ID NO: 472);
PLGKCEATCREMARYFN (SEQ ID NO: 473);
SLQRCEYKLASVRGLCN (SEQ ID NO: 474);
DLWYLESKLEEAARRCNG (SEQ ID NO: 475);
PYMGTRSRAKLLRQQ (SEQ ID NO: 476);
RNAGERRWFKTQGWY (SEQ ID NO: 477);
MLAERNADDRRWFNTHGRD (SEQ ID NO: 478);
MMADGRLRNSVGLILWCD (SEQ ID NO: 479);
MLADGRLRNWG (SEQ ID NO: 480);
LLADVRRRNGVGLLRMGRD (SEQ ID NO: 481);
MLADGRLRNFGG (SEQ ID NO: 482);
TYMTYVYWLC (SEQ ID NO: 483);
RFGERWGL (SEQ ID NO: 484);
HWLWWGWNF (SEQ ID NO: 485);
RECFQMLERC (SEQ ID NO: 486);
ILAHRNAKERRWFQKHGR (SEQ ID NO: 487); and
CSTGGGLTAERDAEKRRWLLTHGGEK (SEQ ID NO: 489).

Particularly preferred sequences are selected from the group consisting of:
LLDIAELKLQECARRCN (SEQ ID NO: 463); and
KLLDIAELKLQECCARRCN (SEQ ID NO: 464).

I. Synthesis of the Peptides: 

Standard solid phase peptide synthesis techniques are preferred for synthesis of
the peptides of the present invention. Such techniques are described, for example, by
Merrifield (1963) J. Am. Chem. Soc. 85:2149. As is well known in the art, solid phase
synthesis using the Merrifield method involves successive coupling of α-amino protected
amino acids to a growing support-bound peptide chain. After the initial coupling of a
protected amino acid to a resin support (e.g., a polystyrene resin, a chloromethylated resin,
a hydroxymethyl resin, a benzhydrylamine resin, or the like, depending on the chemistry
used), the α-amino protecting group is removed by a choice of reagents, depending on the
specific protecting group. Suitable α-amino protecting groups are those known to be
useful in the art of stepwise synthesis of peptides. Included are acyl type protecting groups
(e.g., formyl, trifluoroacetyl, acetyl), aromatic urethane type protecting groups (e.g.,
benzyloxycarbonyl (Cbz) and substituted Cbz), aliphatic urethane protecting groups (e.g.,
t-butyloxycarbonyl (Boc), isopropyloxycarbonyl, cyclohexyloxycarbonyl), alkyl type
protecting groups (e.g., benzyl, triphenylmethyl), fluorenylmethyloxycarbonyl (Fmoc),
allyloxycarbonyl (Alloc) and Dde. The side chain protecting groups (typically ethers, esters,
trityl, and the like) remain intact during coupling; however, the side chain protecting group
must be removable upon completion of the synthesis of the final peptide. Preferred side
chain protecting groups, as will be appreciated by those skilled in the art, will depend on the
particular amino acid that is being protected as well as the overall chemistry used. After
removal of the α-amino protecting group, the remaining protected amino acids are coupled
stepwise in the desired order. Each protected amino acid is generally reacted in about a
3-fold excess using an appropriate carboxyl group activator such as
2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HBTU) or
dicyclohexylcarbodiimide (DCC) in solution, for example, in methylene chloride
(CH2Cl2), N-methyl pyrrolidone, dimethyl formamide (DMF), or mixtures thereof.

Once the synthesis is complete, the compound is cleaved from the solid support
by treatment with a reagent such as trifluoroacetic acid, preferably in combination with a
scavenger such as ethanedithiol, β-mercaptoethanol or thioanisole. The cleavage reagent
not only cleaves the peptide from the resin, but also cleaves all remaining side chain
protecting groups.

These procedures can also be used to synthesize peptides containing amino acids
other than the 20 naturally occurring, genetically encoded amino acids. For instance,
naphthylalanine (1-Nal or 2-Nal) can be substituted for tryptophan. Other synthetic
amino acids that can be substituted into the peptides of the present invention include, but
are not limited to, nor-leucine and 3-pyridylalanine.

III. Variations and Modifications

A. Dimer Forms (With a Terminal Linking Moiety): 

The compounds of the present invention may be in the form of a dimer, i.e., a
compound comprised of two similar (but not necessarily identical) peptide sequences.
Preferably, the dimer compounds of the invention have the structure of formula (VIII)

                  (βA)n4-R2-(βA)n2
                 /                \
(VIII)      (Lk)x                  (Lk)y
                 \                /
                  (βA)n3-R1-(βA)n1

wherein R1, R2, n1, n2, n3, n4, x, y and Lk are defined as follows.

R1 is a peptide chain that binds to G-CSFR and contains a sequence of amino
acids of the present invention. R2 is also a peptide chain that binds to G-CSFR and
contains a sequence of amino acids of the present invention. As previously indicated, R1
and R2 can be the same or different. It is preferred, however, that R1 and R2 are the same.

βA is a β-alanine residue and may or may not be present, meaning that n1, n2,
n3 and n4 are independently zero or 1.

Lk is a terminal linking moiety. If the dimer contains only one linking moiety, 
one of x and y is zero and the other is one. Alternatively, if the dimer contains two linking 
moieties, both x and y are one. Thus, x and y are independently zero or one with the 
proviso that the sum of x and y is either one or two. 
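
The constraints on n1 to n4, x and y stated above lend themselves to a simple validity check. The following is a minimal sketch, assuming a dimer is described only by its two peptide sequences and the six integer indices; the class name `DimerSpec` and its fields are hypothetical and serve purely to illustrate the stated rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimerSpec:
    """Hypothetical description of a formula (VIII) dimer."""
    r1: str          # peptide sequence R1
    r2: str          # peptide sequence R2 (may equal R1)
    n1: int = 0      # 1 if the corresponding beta-alanine residue is present
    n2: int = 0
    n3: int = 0
    n4: int = 0
    x: int = 1       # 1 if the first terminal linking moiety is present
    y: int = 0       # 1 if the second terminal linking moiety is present

    def __post_init__(self) -> None:
        # Each index is independently zero or one.
        for name in ("n1", "n2", "n3", "n4", "x", "y"):
            if getattr(self, name) not in (0, 1):
                raise ValueError(f"{name} must be 0 or 1")
        # The sum of x and y must be one or two (at least one linker).
        if self.x + self.y not in (1, 2):
            raise ValueError("x + y must be 1 or 2")

    @property
    def linker_count(self) -> int:
        """Number of terminal linking moieties in the dimer."""
        return self.x + self.y
```

A dimer with a single linking moiety would be written, for instance, as `DimerSpec(r1=seq, r2=seq, x=0, y=1)`, while setting both x and y to zero raises an error because the two chains would not be joined.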

The terminal linking moiety Lk can be any moiety recognized by those skilled in
the art as suitable for joining the peptides of R1 and R2. Lk is preferably, although not
necessarily, selected from the group consisting of a disulfide bond, a carbonyl moiety and a
C1-12 linking moiety optionally terminated with one or two -NH- linkages and optionally
substituted at one or more available carbon atoms with a lower alkyl substituent.
Preferably, the terminal linking moiety comprises -NH-R3-NH- wherein R3 is lower (C1-6)
alkylene substituted with a functional group such as a carboxyl group or an amino group
that enables coupling to another molecular moiety (e.g., as may be present on the surface
of a solid support), and is optionally substituted with a lower alkyl group. Optimally, the
linking moiety is a lysine residue or lysine amide, i.e., a lysine residue wherein the
carboxyl group has been converted to an amide moiety -CONH2.

NH2-EQSNSGWVWVGGGGC-CONH2 (SEQ ID NO: 101)
NH2-EQSNSGWVWVGGGGC-CONH2 (SEQ ID NO: 101);

CSTGGGLTAERDAEKRRWLLTHGGE (SEQ ID NO: 465)
CSTGGGLTAERDAEKRRWLLTHGGE (SEQ ID NO: 489);

MLAERKAEERRWFNTHGRE (SEQ ID NO: 377)
MLAERKAEERRWFNTHGRE-K(NH2) (SEQ ID NO: 378);

CMLAERKAEERRWFNTHGRE (SEQ ID NO: 380)
CMLAERKAEERRWFNTHGRE-K (SEQ ID NO: 381);

LTAERDAEKRRWLLTHGGEGG (SEQ ID NO: 466)
LTAERDAEKRRWLLTHGGEGG-K (SEQ ID NO: 467); and

LTAERDAEKRRWLLTHGGEGGGGG (SEQ ID NO: 468)
LTAERDAEKRRWLLTHGGEGGGGG-K (SEQ ID NO: 469).



B. Disulfide Bonds: 

When a pair of cysteine residues is present in a peptide of the invention, it is
preferred that the pair form a disulfide bond linking these residues. The disulfide bond
may be present within a single peptide chain forming an intramolecular disulfide bond.
Alternatively, if the compound includes an additional cysteine-containing peptide chain,
the disulfide bond may connect the two chains. In addition, where an additional pair of
cysteine residues exists in the compound, more than one disulfide bond may be present.

Disulfide bond formation may be effected by techniques well known to those
skilled in the art. One such technique involves employing a suitable oxidizing reagent
such that a disulfide bond forms from the free thiols from a pair of cysteine residues.
Undesired disulfide bond formation can be minimized, for example, by protecting the thiol
groups of those cysteine residues not intended to form disulfide bonds and oxidizing the
peptide before removal of any protecting groups. Preferred compounds having disulfide
bonds include, by way of example, the following:



NH2-STAERLWFCG-CONH2 (SEQ ID NO: 135)
NH2-STAERLWFCG-CONH2 (SEQ ID NO: 135);

NH2-QSNSEREWFC-CONH2 (SEQ ID NO: 138)
NH2-QSNSEREWFC-CONH2 (SEQ ID NO: 138);

NH2-QSNSEREWFCG-CONH2 (SEQ ID NO: 149)
NH2-QSNSEREWFCG-CONH2 (SEQ ID NO: 149);

[H]-DLWYLESKLEECARRANG-[NH2] (SEQ ID NO: 339)
[H]-DLWYLESKLEECARRANG-[NH2] (SEQ ID NO: 339);

[H]-DLWYLESKLEEAARRCNG-[NH2] (SEQ ID NO: 475)
[H]-DLWYLESKLEEAARRCNG-[NH2] (SEQ ID NO: 475);

[H]-DLWYLESKLEECARRCNG-[NH2] (SEQ ID NO: 340);

[H]-LLDICELKLQECARRAN-[OH] (SEQ ID NO: 343);

[H]-LLDICELKLQEAARRCN-[OH] (SEQ ID NO: 212);

[H]-K-LLDICELKLQEAARRCN-[OH] (SEQ ID NO: 231);
[Biotin]

[H]-LLDIAELKLQECARRCN-[OH] (SEQ ID NO: 463);

[H]-KLLDIAELKLQECARRCN-[OH] (SEQ ID NO: 464); and

NH3+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208)
| | |
NH3+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208).

A particularly preferred compound having disulfide bonds includes

NH3+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208)
| | |
NH3+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208).

C. N-Terminal Modifications: 

(i) PEGylated Compounds 
The peptides and compounds of the invention can advantageously be modified
with or covalently coupled to one or more of a variety of hydrophilic polymers. It has been
found that when the peptide compounds are derivatized with a hydrophilic polymer, their
solubility and circulation half-lives are increased and their immunogenicity is masked.
Quite surprisingly, the foregoing can be accomplished with little, if any, diminishment in
binding activity. Nonproteinaceous polymers suitable for use in accordance with the
present invention include, but are not limited to, polyalkylethers as exemplified by
polyethylene glycol and polypropylene glycol, polylactic acid, polyglycolic acid,
polyoxyalkenes, polyvinylalcohol, polyvinylpyrrolidone, cellulose and cellulose
derivatives, dextran and dextran derivatives, etc. Generally, such hydrophilic polymers
have an average molecular weight ranging from about 500 to about 100,000 daltons, more
preferably from about 2,000 to about 60,000 daltons and, even more preferably, from
about 5,000 to about 50,000 daltons. In preferred embodiments, such hydrophilic
polymers have average molecular weights of about 5,000 daltons, 10,000 daltons, 20,000
daltons and 40,000 daltons.

The peptide compounds of the invention can be derivatized with or coupled to
such polymers using any of the methods set forth in Zalipsky (1995) Bioconjugate Chem.
6:150-165; Monfardini et al. (1995) Bioconjugate Chem. 6:62-69; U.S. Patent No.
4,640,835; U.S. Patent No. 4,496,689; U.S. Patent No. 4,301,144; U.S. Patent No.
4,670,417; U.S. Patent No. 4,791,192; U.S. Patent No. 4,179,337 or WO 95/34326.

WO 02/07676    PCT/US01/23046

In a preferred embodiment, the N-terminus of a peptide of the invention is
coupled to a polyethylene glycol molecule. It is particularly preferred that the polymer is
selected from the group consisting of polyethylene glycol, polypropylene glycol, polylactic
acid, polyglycolic acid and derivatives and combinations thereof. Most preferably the
polymer is polyethylene glycol (PEG), in which case the peptide is referred to as
"PEGylated." PEG is a linear, water-soluble polymer of ethylene oxide repeating units
with two terminal hydroxyl groups. PEGs are classified by their molecular weights, which
typically range from about 500 daltons to about 40,000 daltons. In a presently preferred
embodiment, the PEGs employed have an average molecular weight of from about 500 to
about 80,000 daltons. It is particularly preferred that the polymer has an average
molecular weight of between about 5,000 and 40,000 daltons.

The PEG coupled to the peptide compounds of the invention can be either
branched or unbranched. (See, e.g., Monfardini et al. (1995) Bioconjugate Chem. 6:62-69.)
PEG is commercially available from Shearwater Polymers, Inc. (Huntsville, Alabama),
Sigma Chemical Co. and other companies. Suitable PEGs include, but are not limited to,
monomethoxypolyethylene glycol (MePEG-OH), monomethoxypolyethylene glycol-succinate
(MePEG-S), monomethoxypolyethylene glycol-succinimidyl succinate (MePEG-S-NHS),
monomethoxypolyethylene glycol-amine (MePEG-NH2),
monomethoxypolyethylene glycol-tresylate (MePEG-TRES) and
monomethoxypolyethylene glycol-imidazolyl-carbonyl (MePEG-IM).

Briefly, in one exemplary embodiment, the hydrophilic polymer which is
employed, e.g., PEG, is capped at one terminus by an unreactive group such as a methoxy
or ethoxy group. Thereafter, the polymer is activated at the other terminus by reaction
with a suitable activating agent, such as a cyanuric halide (e.g., cyanuric chloride, bromide
or fluoride), diimidazole, an anhydride reagent (e.g., a dihalosuccinic anhydride, such as
dibromosuccinic anhydride), acyl azide, p-diazoniumbenzyl ether,
3-(p-diazoniumphenoxy)-2-hydroxypropylether, or the like. The activated polymer is then
reacted with a peptide compound of the invention to produce a polymer-derivatized
peptide compound. Alternatively, a functional group in the peptide compounds of the
invention can be activated for reaction with the polymer, or two groups can be joined in a
concerted coupling reaction using known coupling methods. It will be readily appreciated
that the peptide compounds of the invention can be derivatized with PEG using a myriad
of other reaction schemes known to those of skill in the art.

(ii) Acetylated Compounds

In some instances, the N-terminus of the peptide is acetylated. Preferred 
acetylated compounds include, by way of example, the following: 
Ac-ESGWVW-CONH2 (SEQ ID NO: 470);
Ac-NSGWVW-CONH2 (SEQ ID NO: 471); and
Ac-SGWVW-CONH2 (SEQ ID NO: 472).

The peptides and compounds of the invention can be modified with an acetyl 
moiety (Ac) using standard techniques known to those skilled in the art. One such 
technique includes combining the peptide with an acetylating reagent (e.g., acetyl chloride, 
acetic anhydride) in a suitable solvent to form the acetylated product. To the extent that 
other acetylated products are formed during the reaction, the N-terminus derivative can be 
isolated using conventional separation techniques. 

D. C-Terminal Modifications: 

The peptides and compounds of the invention can advantageously be modified 
to include an amide functionality at the carboxyl terminus of the peptide. Thus, it is 
preferred that the C-terminus of the peptide is amidated. 

In preparing peptides wherein the C-terminus carboxyl group is replaced by the
amide -C(O)NR3R4 where R3 and R4 are independently H or lower (C1-6) alkyl, a
benzhydrylamine resin is preferably used as the solid support for peptide synthesis. Upon
completion of the synthesis, a hydrogen fluoride treatment is employed to release the
peptide from the support, directly resulting in the free peptide amide (i.e., the C-terminus
is -C(O)NH2). Alternatively, use of a chloromethylated resin during peptide synthesis
coupled with reaction with ammonia (to cleave the side chain protected peptide from the
support) yields the free peptide amide, and reaction with an alkylamine or a dialkylamine
yields a side chain protected alkylamide or dialkylamide (i.e., the C-terminus is
-C(O)NR3R4 where R3 and R4 are as defined above). Side chain protecting groups are then
removed in the usual fashion by treatment with hydrogen fluoride to give the free amides,
alkylamides, or dialkylamides.

E. Other Modifications:

One can also replace the naturally occurring side chains of the 20 genetically
encoded amino acids (or the stereoisomeric D amino acids) with other side chains, for
instance with groups such as alkyl, lower alkyl, cyclic 4-, 5-, 6- or 7-membered alkyl,
amide, amide lower alkyl, amide di(lower alkyl), lower alkoxy, hydroxy, carboxy and the
lower ester derivatives thereof, and 4-, 5-, 6- or 7-membered heterocyclic. In particular,
proline analogues in which the ring size of the proline residue is changed from 5 members
to 4, 6, or 7 members can be employed.

One can also readily modify the peptides herein by phosphorylation or other
methods as described in Hruby et al. (1990) Biochem. J. 268:249-262. Thus, the peptides of
the invention also serve as structural models for non-peptidic compounds with similar
biological activity. For example, the peptide backbones may be replaced with a backbone
composed of phosphonates, amidates, carbamates, sulfonamides, secondary amines, and
N-methylamino acids.

IV. Utility 

The compounds of the invention are useful in vitro as unique tools for
understanding the biological role of G-CSF, including the evaluation of the many factors
thought to influence, and be influenced by, the production of white blood cells. The
present compounds are also useful in the development of other compounds that bind to
G-CSFR, because the compounds provide important structure-activity relationship (SAR)
information that facilitates that development.

Moreover, based on the ability to bind to G-CSFR and related receptors, a
compound of the invention can be used as a reagent for detecting a G-CSF receptor or
related receptor on living cells, fixed cells, in biological fluids, in tissue homogenates, in
purified, natural biological materials, etc. For example, by labeling a compound of the
invention, one can identify a cell expressing G-CSFR on its surface. In addition, based on
its ability to bind a G-CSFR, a compound of the invention can be used in in situ staining,
FACS (fluorescence-activated cell sorting), Western blotting, ELISA (enzyme-linked
immunosorbent assay), etc. In addition, because of its ability to bind to a G-CSFR, a
compound of the invention can be used in receptor purification or in purifying cells
expressing G-CSFR on the cell surface (or inside permeabilized cells).

A compound of the invention can also be utilized as a commercial research
reagent for various medical research and diagnostic uses. Such uses include but are not
limited to: (1) use as a calibration standard for quantitating the activities of candidate
G-CSFR antagonists or agonists in a variety of functional assays; (2) use as a blocking
reagent in random peptide screening, i.e., in searching for new families of G-CSFR peptide
ligands; (3) use in the co-crystallization with G-CSFR, i.e., a compound of the invention
will allow formation of crystals bound to G-CSFR, enabling the determination of
receptor/peptide structure by X-ray crystallography; (4) use in inhibiting or decreasing the
proliferation and growth of G-CSF-dependent cell lines; and (5) other research and
diagnostic applications wherein the action of G-CSFR is to be mimicked, and the like.

A compound of the invention can also be administered to a warm blooded
animal, including a human, to treat a disease that would benefit from the ability of a
compound to mimic the effects of G-CSF in vivo. Thus, the present invention encompasses
methods for treating a patient who would benefit from a G-CSFR modulator, comprising
administering to the patient a therapeutically effective amount of a compound of the
invention to activate G-CSFR. For example, a compound of this invention will find use in
the treatment of diseases such as a depressed neutrophil count. Although attributable to a
myriad of causes, a depressed neutrophil count is commonly associated with
chemotherapy, AIDS and pneumonia (particularly community-acquired pneumonia).
Thus, it is preferred that a compound of the present invention be used to treat a depressed
neutrophil count selected from the group consisting of chemotherapy-induced neutropenia,
AIDS-induced neutropenia and community-acquired pneumonia-induced neutropenia.

In addition, the invention encompasses methods for treating a patient who would
benefit from a G-CSFR modulator, comprising administering to the patient a
therapeutically effective amount of a compound of the invention that antagonizes the
action of G-CSF on the G-CSFR in vivo. For example, these receptor antagonists are
administered prior to and during chemotherapy to confer chemoprotection to the neutrophil
progenitor cells by preventing their proliferation in the presence of cytotoxic drugs. Once
chemotherapy administration is suspended, the administration of the chemoprotective
G-CSFR antagonists is also suspended, thereby allowing the patient's endogenous G-CSF to
stimulate proliferation. Alternatively, the neutrophil progenitor cells may be "rescued" by
administration of G-CSF or by a G-CSF agonist, e.g., a compound of the present invention
having G-CSF agonist activity.

Accordingly, the invention includes pharmaceutical compositions comprising, as
an active ingredient, at least one of the compounds of the invention in association with a
pharmaceutical carrier or diluent. The composition can be administered by oral, parenteral
(intramuscular, intraperitoneal, intravenous (IV) or subcutaneous) injection, transdermal
(either passively or using iontophoresis or electroporation), or transmucosal (nasal,
vaginal, rectal, or sublingual) routes of administration, or using bioerodible inserts, and
can be formulated in dosage forms appropriate for each route of administration.

Solid dosage forms for oral administration include capsules, tablets, pills,
powders, and granules. In such solid dosage forms, the active compound is admixed with
at least one inert pharmaceutically acceptable carrier such as sucrose, lactose, or starch.
Such dosage forms can also comprise, as is normal practice, an additional substance other
than an inert diluent, e.g., a lubricating agent such as magnesium stearate. In the case of
capsules, tablets, and pills, the dosage forms may also comprise a buffering agent. Tablets
and pills can additionally be prepared with enteric coatings.

Liquid dosage forms for oral administration include pharmaceutically acceptable
emulsions, solutions, suspensions, syrups and elixirs containing an inert diluent
commonly used in the art, such as water. These compositions can also include one or
more adjuvants, such as a wetting agent, an emulsifying agent, a suspending agent, a
sweetening agent, a flavoring agent or a perfuming agent.

Preparations for parenteral administration include sterile aqueous or
non-aqueous solutions, suspensions, and emulsions. Examples of non-aqueous solvents or
vehicles are propylene glycol, polyethylene glycol, vegetable oils, such as olive oil and
corn oil, gelatin, and injectable organic esters such as ethyl oleate. Such dosage forms may
also contain one or more adjuvants such as a preserving agent, a wetting agent, an
emulsifying agent and a dispersing agent. The dosage forms may be sterilized by, for
example, filtration through a bacteria-retaining filter, by incorporating sterilizing agents
into the compositions, by irradiating the compositions, or by heating the compositions.
They can also be manufactured using sterile water, or some other sterile injectable
medium, prior to use.

Compositions for rectal or vaginal administration are preferably suppositories
which may contain, in addition to the active substance, an excipient such as cocoa butter or 
a suppository wax. Compositions for nasal or sublingual administration are also prepared 
with one or more standard excipients well known in the art. 

The dosage of active ingredient in the compositions of this invention may be 
varied; however, it is necessary that the amount of the active ingredient is such that a 
suitable dosage form is obtained. The selected dosage depends upon the desired 
therapeutic effect, the route of administration, the duration of the treatment desired, and 
other factors well known to those skilled in the art. Generally, dosage levels of between
0.001 and 10 mg/kg of body weight daily are administered to mammals.

It is to be understood that while the invention has been described in conjunction
with the preferred specific embodiments thereof, the foregoing description as well as
the examples which follow are intended to illustrate and not limit the scope of the
invention. Other aspects, advantages and modifications within the scope of the invention
will be apparent to those skilled in the art to which the invention pertains.

Experimental

The following examples are put forth so as to provide those of ordinary skill in
the art with a complete disclosure and description of how to prepare and use the 
compounds disclosed and claimed herein. Efforts have been made to ensure accuracy with 
respect to numbers (e.g., amounts, temperature, etc.) but some errors and deviations should 
be accounted for. Unless indicated otherwise, parts are parts by weight, temperature is in 
°C and pressure is at or near atmospheric. 

Standard peptide synthetic methods were used, and solid phase reactions were 
carried out at room temperature. Unless otherwise indicated, all starting materials and 
reagents were obtained commercially, e.g., from Aldrich, Sigma and ICN, and used 
without further purification. Standard cell culture and cell harvesting procedures were 
used. 

Also, in these examples and throughout this specification, the abbreviations 
employed have their generally accepted meanings, as follows: 
Ac = acetyl
BSA = bovine serum albumin
DMSO = dimethyl sulfoxide
DTT = dithiothreitol
HPLC = high pressure liquid chromatography
MBP = maltose binding protein
PBS = phosphate-buffered saline
SDS PAGE = sodium dodecyl sulfate polyacrylamide gel electrophoresis
TCEP = tris(2-carboxyethyl)phosphine
TFA = trifluoroacetic acid
Tris = tris[hydroxymethyl]aminomethane

Examples 1-34 

G-CSF Competition Binding Assays 
The peptides of Table 1 were synthesized using standard techniques and were 
subsequently evaluated to identify whether the peptides exhibited specific and/or 
competitive binding. 

Specific binding is binding of a ligand to a specific receptor, as opposed to
non-specific binding that is mediated by non-specific interactions. Specific binding may
be measured by subtraction of the non-specific binding (measured in the presence of
saturating concentrations of unlabeled ligand) from the total binding (measured in the
absence of saturating amounts of ligand). Typically, the unlabeled ligand used was a
variant of G-CSF in which the cysteine normally found at position 17 was converted to
serine (CS17).
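
The subtraction described above can be sketched in a few lines of Python. This is only an illustration: the function name and the count values are hypothetical and are not taken from the assays reported here.

```python
def specific_binding(total_cpm, nonspecific_cpm):
    """Specific binding = total binding - non-specific binding (same units)."""
    return total_cpm - nonspecific_cpm


# Hypothetical counts for illustration only (not data from this specification).
total = 12500.0        # cpm bound without excess unlabeled ligand
nonspecific = 1500.0   # cpm bound with saturating unlabeled ligand
specific = specific_binding(total, nonspecific)
print(f"specific binding: {specific:.0f} cpm "
      f"({100 * specific / total:.0f}% of total)")
```

Both measurements must be made under otherwise identical conditions for the difference to be meaningful.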

Determination of competitive binding was also carried out for a number of
peptides. Briefly stated, G-CSFR was purified using standard techniques. The receptor
was then immobilized in microtiter plate wells that were coated with acid-treated antibody
(Ab179) specific for a site on G-CSFR not involved with G-CSF binding. Separately, 125I
was coupled to the natural ligand G-CSF using techniques well known in the art. Test
peptides were added to receptor-coated wells and allowed to bind to immobilized receptor
for approximately 30 minutes. 125I-labeled G-CSF was then introduced to the wells and
incubated overnight at 4 °C. Unbound 125I-labeled G-CSF was removed by washing the
plate several times, and the amount of radioactivity that remained in each well was then
measured using conventional techniques. If no reduction in the amount of bound
125I-labeled G-CSF was detected, the peptide did not compete for binding to the receptor.
Alternatively, if reduced amounts or no 125I-labeled G-CSF was detected, the peptide did
compete. Non-positive results for a particular peptide are not dispositive of that peptide's
activity: the peptide may exhibit binding under conditions different from those tested.

The results of these assays reveal important information about the structure activity 
relationship for peptide and peptide mimetics of the invention to the G-CSF receptor. 



Table 1

Ex. No.   Sequence                      Specific Binding?   Competitive Binding?
1         CAGEVMHMCC (SEQ ID NO: 8)     Yes                 Yes
2         CNREIEAMCC (SEQ ID NO: 9)     Yes                 Yes
3         CADEVMHFCC (SEQ ID NO: 10)    Yes                 Yes
4         CDVWQLFDRC (SEQ ID NO: 25)    Yes                 Yes
5         CSFVQLNSIC (SEQ ID NO: 26)    Yes                 Yes
6         CVPWMFYDLC (SEQ ID NO: 29)    Yes                 No
7         CDPWMFYDLC (SEQ ID NO: 30)    Yes                 No
8         CQRAGYMLAC (SEQ ID NO: 44)    No                  No
9         CHANPVWGEC (SEQ ID NO: 45)    No                  No
10        CTWTDLESVY (SEQ ID NO: 433)   No                  No
11        CFWSDWGQTC (SEQ ID NO: 46)    No                  No
12        CPDWYQSYMC (SEQ ID NO: 34)    Yes                 Yes
13        CPHWTSYYMC (SEQ ID NO: 47)    Yes                 Yes
14        CACMLRWHC (SEQ ID NO: 43)     Yes                 Yes
15        CETLCGACFC (SEQ ID NO: 48)    No                  No
16        SNESGWVWLP (SEQ ID NO: 110)   Yes                 No
17        EQSNSGWVWV (SEQ ID NO: 111)   Yes                 No
18        SRTESGWVWT (SEQ ID NO: 112)   Yes                 No
19        QRANSGWVWV (SEQ ID NO: 113)   Yes                 No
20        DYDNSGWVWH (SEQ ID NO: 114)   Yes                 No
21        ETWGERDWFC (SEQ ID NO: 133)   Yes                 Yes
22        STAERLWFCG (SEQ ID NO: 135)   Yes                 Yes
23        YETAERSYFC (SEQ ID NO: 119)   Yes                 Yes
24        ADNAERGWFC (SEQ ID NO: 137)   Yes                 Yes
25        QSNSEREWFC (SEQ ID NO: 138)   Yes                 Yes
26        STSERAWFCG (SEQ ID NO: 139)   Yes                 Yes
27        ASWSERGWFC (SEQ ID NO: 140)   Yes                 Yes
28        ELSSEREWFC (SEQ ID NO: 141)   Yes                 Yes
29        DMQGERGWFC (SEQ ID NO: 142)   Yes                 Yes
30        DMVYAYPPWS (SEQ ID NO: 155)   Yes                 No
31        DEMVYTVPYW (SEQ ID NO: 156)   Yes                 Yes
32        HTTNEQFFMC (SEQ ID NO: 434)   Yes                 Yes
33        DTWLELESRY (SEQ ID NO: 435)   Yes                 No
34        DWQKTIPAYW (SEQ ID NO: 437)   Yes                 Yes



Examples 35-73

G-CSF Radioligand Binding Assays
The peptides of Table 2 were synthesized using standard techniques and were
subsequently evaluated to determine their binding affinities to G-CSFR.

Streptavidin-coated scintillation proximity assay (SPA) beads (Amersham) were
mixed with biotinylated anti-receptor immobilizing antibody (Ab179) followed by
incubation with soluble G-CSFR harvest. Receptor-coated SPA beads were washed twice
in PBS/0.1% BSA and distributed to wells of a white polystyrene 96-well microtiter plate
(Packard). Serial dilutions of peptide or peptide mimetic were mixed with a constant
amount of 125I-labeled G-CSF (10^5 cpm; 1290 Ci/mmol) in PBS/0.1% BSA, added to wells
containing receptor-coated SPA beads, and incubated overnight at 4 °C. The binding of
radiolabeled G-CSF to the receptor-coated SPA bead brings the isotope in close proximity
to the scintillant, which allows the emitted radiation to stimulate the scintillant to emit
light. Any unbound radiolabeled ligand is not in close enough proximity to the scintillant
to allow such energy transfer and hence no signal is generated. The amount of 125I-labeled
G-CSF that was bound at equilibrium was measured by counting the plate in a TopCount
(Wallac) microtiter plate luminometer. The assay is conducted over a range of peptide
concentrations and the results are graphed such that the y-axis represents the amount of
bound 125I-labeled G-CSF and the x-axis represents the concentration of peptide or peptide
mimetic. One can determine the concentration at which the peptide or peptide mimetic
will reduce by 50% (IC50) the amount of 125I-labeled G-CSF bound to immobilized
G-CSFR. The dissociation constant (Kd) for the peptide should be similar to the measured
IC50 using the assay conditions described above.
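
As a rough illustration of how an IC50 might be read off such a curve, the Python sketch below interpolates on a log-concentration axis between the two dilutions that bracket half of the uninhibited signal. The dilution series and counts are hypothetical, and the interpolation is a simplification rather than the curve-fitting procedure actually used. The `cheng_prusoff` helper applies the standard Cheng-Prusoff correction, which this specification does not state; it is included only to show why the inhibitor constant approaches the IC50 when the tracer concentration is small relative to the tracer's own Kd. The 100 pM value is the G-CSF affinity quoted in the Background, and the 10 pM tracer concentration is an assumption.

```python
import math


def ic50_by_interpolation(concs, bound):
    """Estimate IC50 by log-linear interpolation.

    concs: ascending competitor concentrations (molar).
    bound: background-subtracted signal at each concentration.
    Returns None if the curve never crosses 50% of the maximum signal.
    """
    half = max(bound) / 2.0
    points = list(zip(concs, bound))
    for (c_lo, b_lo), (c_hi, b_hi) in zip(points, points[1:]):
        if b_lo > half >= b_hi:
            frac = (b_lo - half) / (b_lo - b_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


def cheng_prusoff(ic50, tracer_conc, tracer_kd):
    """Standard Cheng-Prusoff relation: Ki = IC50 / (1 + [L]/Kd)."""
    return ic50 / (1.0 + tracer_conc / tracer_kd)


# Hypothetical ten-fold dilution series and counts (illustration only).
concs = [1e-8, 1e-7, 1e-6, 1e-5, 1e-4]
bound = [10000.0, 9000.0, 6000.0, 4000.0, 1000.0]
ic50 = ic50_by_interpolation(concs, bound)
ki = cheng_prusoff(ic50, tracer_conc=10e-12, tracer_kd=100e-12)
print(f"IC50 ~ {ic50:.2e} M, Ki ~ {ki:.2e} M")
```

With a tracer concentration one tenth of its Kd, the correction is only about 10%, which is consistent with treating the IC50 as an approximate dissociation constant.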

The peptides along with their corresponding IC50 values are shown in Table 2. IC50
values are indicated symbolically by the symbols "-", "+", and "++". For example, those
peptides which showed IC50 values in excess of 200 µM are indicated with a "-". Those
peptides which gave IC50 values of less than or equal to 200 µM are given a "+", while
those which gave IC50 values of 500 nM or less are indicated with a "++". Those peptides
which gave IC50 values at or near the cutoff point for a particular symbol are indicated
with a hybrid designator, e.g., "+/-". The peptides for which IC50 values were not
determined are listed as "N.D.".
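
The symbol scheme can be written as a small lookup, sketched here in Python. The cutoffs (500 nM and 200 µM) come from the text and the "-" symbol from Table 2; the function name is hypothetical, and the hybrid designator "+/-" is deliberately not modeled because the specification does not define how close to a cutoff a value must be to receive it.

```python
def ic50_symbol(ic50_molar):
    """Map an IC50 (in molar units) to the symbol used in Table 2.

    None means the value was not determined. Hybrid designators such as
    "+/-" are not modeled; their thresholds are not given in the text.
    """
    if ic50_molar is None:
        return "N.D."
    if ic50_molar <= 500e-9:   # 500 nM or less
        return "++"
    if ic50_molar <= 200e-6:   # less than or equal to 200 uM
        return "+"
    return "-"                 # in excess of 200 uM


for value in (100e-9, 1e-6, 300e-6, None):
    print(value, ic50_symbol(value))
```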

The results of these assays reveal important information about the structure-activity 
relationship for peptide and peptide mimetics of the invention to the G-CSF receptor. 



Table 2

Ex. No.   Sequence                                              IC50
35        NH2-EQSNSGWVWV-CONH2 (SEQ ID NO: 111)                 +
36        NH2-STAERLWFCG-CONH2 (SEQ ID NO: 135)
37        NH2-STAERLWFCG-CONH2 (SEQ ID NO: 135)                 +
          NH2-STAERLWFCG-CONH2 (SEQ ID NO: 135)
38        NH2-QSNSEREWFC-CONH2 (SEQ ID NO: 138)
39        NH2-QSNSEREWFC-CONH2 (SEQ ID NO: 138)                 -
          NH2-QSNSEREWFC-CONH2 (SEQ ID NO: 138)
40        NH2-QSNSEREWFCG-CONH2 (SEQ ID NO: 149)                -
41        NH2-QSNSEREWFCG-CONH2 (SEQ ID NO: 149)
          NH2-QSNSEREWFCG-CONH2 (SEQ ID NO: 149)
42        Ac-ESGWVW-CONH2 (SEQ ID NO: 470)
43        Ac-NSGWVW-CONH2 (SEQ ID NO: 471)
44        Ac-SGWVW-CONH2 (SEQ ID NO: 472)
45        NH2-EQSNSGWVWVGGGGC-CONH2 (SEQ ID NO: 101)            +
46        NH2-EQSNSGWVWVGGGGC-CONH2 (SEQ ID NO: 101)            +
          NH2-EQSNSGWVWVGGGGC-CONH2 (SEQ ID NO: 101)
47        CESRLVECSRM (SEQ ID NO: 462)                          +/-
48        LAHCLLRLEECAAG (SEQ ID NO: 460)                       +/-
49        ALLMCESKLAECARAR (SEQ ID NO: 450)                     +/-
50        DLWYLESKLEECARRANG (SEQ ID NO: 339)                   +
          DLWYLESKLEECARRANG (SEQ ID NO: 339)
51        DLWYLESKLEECARRCNG (SEQ ID NO: 340)                   +
52        DLWYLESKLEEAARRCNG (SEQ ID NO: 475)                   +
          DLWYLESKLEEAARRCNG (SEQ ID NO: 475)
53        LLDICELKLQECARRCN (SEQ ID NO: 208)                    ++
54        GGGLLDICELKLQECARRCN (SEQ ID NO: 209)                 ++
55        GRTGGGLLDICELKLQECARRCN (SEQ ID NO: 210)              ++
56        LGIEGRTGGGLLDICELKLQECARRCN (SEQ ID NO: 211)          ++
57        LLDICELKLQECARRAN (SEQ ID NO: 343)                    +
58        LLDICELKLQEAARRCN (SEQ ID NO: 212)
59        Biotin-LLDICELKLQECARRAN (SEQ ID NO: 343)             +
60        Biotin-KLLDICELKLQEAARRCN (SEQ ID NO: 213)
61        LLDIAELKLQECARRCN (SEQ ID NO: 463)                    +
62        Biotin-KLLDIAELKLQECARRCN (SEQ ID NO: 464)            +
63        Biotin-KGGGMLAERKAEERRWFNTHGRE (SEQ ID NO: 490)       +
64        MLAERKAEERRWFNTHGRE (SEQ ID NO: 377)                  +/-
          MLAERKAEERRWFNTHGREK (SEQ ID NO: 378)
65        CMLAERKAEERRWFNTHGRE (SEQ ID NO: 380)                 N.D.
          CMLAERKAEERRWFNTHGREK (SEQ ID NO: 381)
66        H2N-KSTGGLTAERDAEKRRWLLTHGGE-COOH (SEQ ID NO: 491)    -
67        CSTGGGLTAERDAEKRRWLLTHGGE (SEQ ID NO: 465)            +
          CSTGGGLTAERDAEKRRWLLTHGGE (SEQ ID NO: 465)
68        LTAERDAEKRRWLLTHGGEGG (SEQ ID NO: 466)                -
          LTAERDAEKRRWLLTHGGEGGK (SEQ ID NO: 467)
69        LTAERDAEKRRWLLTHGGEGGGGG (SEQ ID NO: 468)
          LTAERDAEKRRWLLTHGGEGGGGGK (SEQ ID NO: 469)
70        YLELCQLRLEECARQFN (SEQ ID NO: 282)                    +
71        CGCHVSPVQIKALC (SEQ ID NO: 198)                       +
72        GCHVSPVQIKALC (SEQ ID NO: 199)
73        HELCETYADWLGCVEW (SEQ ID NO: 76)                      N.D.



Examples 74-81

Cell Proliferation and Luminescence Assays
The bioactivity of selected peptides of the invention was measured in cell-based
assays. Murine NFS-60 cells proliferate in the presence of G-CSF in a dose-dependent
manner and were used in standard cell proliferation assays that are well known in the art.
Murine IL-3-dependent Ba/F3 cells were co-transfected with expression vectors encoding
the full-length human G-CSFR and a luciferase reporter gene controlled by the fos
promoter. The Ba/F3 G-CSFR reporter cell line is not only dependent on the presence of
G-CSF for proliferation, but also produces luciferase in response to the addition of G-CSF
in a dose-dependent manner. The parental, untransfected cell line does not respond to
G-CSF or produce luciferase, but remains IL-3 dependent.

Reporter cell assays were performed on the above cell line using peptides of the
invention. The cells were maintained in complete RPMI-1640 media containing 10% fetal
calf serum, 2 mM L-glutamine, 1X antibiotic-antimycotic solution (Life Technologies),
and 10% WEHI-3 conditioned media (source of murine IL-3). For reporter assays, cells
were starved overnight in medium lacking WEHI-3 conditioned media to reduce luciferase
expression to background levels. The cells were then washed twice in PBS, resuspended in
media lacking WEHI-3 conditioned media, and added to wells of a 96-well microtiter plate
containing dilutions of peptide or G-CSF at 5 x 10^4 cells/well. Plates were incubated for 2
hours at 37 °C in a humidified 5% CO2 incubator and luciferase activity was measured by
the addition of luciferin (LucLite - Packard Biosciences) to each well. The plates were
read in a TopCount (Wallac) microtiter plate luminometer.

To measure the ability of selected peptides of the invention to block G-CSF
mediated receptor activation, dilutions of peptide were combined with Ba/F3 G-CSFR
reporter cells as described above. After a 30-minute incubation at 37 °C, G-CSF was
added to each well. The cells were incubated for 2 hours at 37 °C and the amount of
luciferase produced was measured as described above.

The following seven peptides were tested for bioactivity:

Ex. 74 NH2-EQSNSGWVWV-CONH2 (SEQ ID NO: 111);

Ex. 75 NH2-STAERLWFCG-CONH2 (SEQ ID NO: 135);

Ex. 76 NH2-STAERLWFCG-CONH2 (SEQ ID NO: 135)
       NH2-STAERLWFCG-CONH2 (SEQ ID NO: 135);

Ex. 77 QLETCVLKLEECARRCN (SEQ ID NO: 315);

Ex. 78 LLDICELKLQECARRCN (SEQ ID NO: 208);

Ex. 79 PLFSCELKKQECARRCN (SEQ ID NO: 323); and

Ex. 80 DLWYLESKLEECARRCN (SEQ ID NO: 338).

Examples 74, 75, and 76 showed antagonist activity at high concentrations in 
cell-based assays using NFS-60 cells. The stability of Example 74 in cell culture medium 
was tested by overnight incubation in NFS-60-conditioned medium; no loss of activity was 
observed, indicating that the peptide is stable to degradation under these conditions. 

Examples 77, 78, 79, and 80 showed cell proliferation activity when fused to the
carboxy-terminus of the maltose binding protein (MBP). The MBP fusion protein of
Example 78 in particular showed high affinity in a binding competition assay with
125I-G-CSF (IC50 ~10 nM) and activity in a Ba/F3 G-CSFR cell proliferation assay
(maximal activity at 100 nM). Parental Ba/F3 cells and Ba/F3 cells expressing the human
thrombopoietin receptor did not proliferate in response to this fusion protein. Western blot
analysis of the fusion protein revealed both monomeric and dimeric species; however, the
G-CSFR preferentially binds the dimeric molecule. This is true for most of the MBP
fusions tested. Presumably the fusion protein is dimerized through intermolecular
disulfide bonds between cysteine residues present in the peptide sequence. Cleavage of
the peptide from the carboxy terminus of MBP using Factor Xa caused the peptide to lose
its bioactivity while retaining its binding activity.

The Ba/F3 G-CSFR reporter cell line was used to measure the potency of:
Ex. 81 LLDICELKLQECARRCN (SEQ ID NO: 208)
and other possible G-CSF receptor antagonists.

Ligand-mediated G-CSF receptor activation in these cells results in the expression
of luciferase, providing a detectable biological signal. Ba/F3 G-CSFR reporter cells
responded to the addition of G-CSF in a dose-dependent manner (Figure 2). The addition
of increasing concentrations of peptide from Example 81 inhibited this G-CSF response,
indicating that the peptide is a G-CSFR antagonist (Figure 3).

Example 82 

Characterization of the Dimer Form of AF15846 

The peptide AF15846, i.e., LLDICELKLQECARRCN (SEQ ID NO: 208), was
under study as a G-CSF antagonist for chemoprotection against chemotherapy-induced
neutropenia. The peptide monomer contains three Cys residues and has an average mass of
2020.4. This peptide is not active as a monomer but must be oxidized, putatively to a
dimer form, for activity.
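
The quoted average mass can be checked from standard average residue masses, as in the Python sketch below. The residue table covers only the amino acids present in AF15846, the function names are hypothetical, and the dimer mass assumes three disulfide bonds (a loss of two hydrogens per bond); it is a calculated value, not a measurement reported in this specification.

```python
# Average residue masses (Da) for the residues that occur in AF15846 only.
RESIDUE_MASS = {
    "A": 71.0788, "C": 103.1388, "D": 115.0886, "E": 129.1155,
    "I": 113.1594, "K": 128.1741, "L": 113.1594, "N": 114.1038,
    "Q": 128.1307, "R": 156.1875,
}
WATER = 18.0153      # added once per linear chain (free N- and C-termini)
HYDROGEN = 1.00794   # two hydrogens are lost per disulfide bond formed


def average_mass(sequence):
    """Average mass of a linear, fully reduced peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def disulfide_dimer_mass(sequence, n_disulfides):
    """Average mass of a homodimer held together by n disulfide bonds."""
    return 2 * average_mass(sequence) - 2 * n_disulfides * HYDROGEN


AF15846 = "LLDICELKLQECARRCN"
print(f"monomer: {average_mass(AF15846):.1f} Da")
print(f"dimer with three disulfides: {disulfide_dimer_mass(AF15846, 3):.1f} Da")
```

The monomer value agrees with the 2020.4 figure given above to within rounding.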

Monomer vs. dimer forms of AF15846: 

AF15846 that had been oxidized in 50 mM Tris, pH 8.0 for 48 hours was diluted 
with PBS, then injected onto a Superdex peptide gel filtration column equilibrated in PBS 
at 0.75 mL/min. The results of this chromatography indicated that most of the peptide was 
in dimer form, with small amounts of monomer remaining (not shown). In contrast, 
AF15846 that had been stored in acid and then diluted with PBS directly prior to injection
onto the peptide column eluted predominantly as a monomer. Some dimerization 
apparently occurred either during storage or during the short period the peptide was at 
neutral pH prior to and during size exclusion chromatography. Oxidized peptide also 
eluted much later from a cation exchange column run in salt gradients at low pH, 
consistent with dimer formation (not shown). 

Reverse phase HPLC assay for oxidation of AF15846: 

AF15846 was oxidized by incubation in 50 mM Tris, pH 8.0, for 16 to 48 hours.
Reverse phase HPLC methods using a Vydac 25 cm C-18 column and 0.1%
TFA/acetonitrile buffers were developed to separate the oxidized dimer from unoxidized
monomer, and to separate several different dimerized peptide structures. While both high
pH reverse phase and cation exchange chromatography were also investigated, low pH
reverse phase separation on a 25 cm column provided the best separation of the many
oxidized forms of the peptide (not shown). The dimer species elute from the column with
earlier retention times than do the monomer species. Samples of oxidized AF15846 were
re-reduced with DTT to confirm the elution order. One additional piece of evidence for
the formation of intermolecular dimers comes from the fact that when oxidation was
carried out at low (0.25 mg/mL) concentrations of peptide, the reaction apparently did not
go to completion.

Oxidation of AF15846 under various conditions:

AF15846 was incubated for 48 hours in 50 mM Tris, pH 8, 20% DMSO in water,
20 mM potassium phosphate, pH 3, or 0.1% TFA at room temperature. Aliquots of each
sample were taken at various time points. Oxidation of the monomer peptide in Tris
resulted in the presence of one major plus one minor oxidized species after several hours.
In contrast, oxidation of the peptide in 20% DMSO in water resulted in a complex mixture
of oxidized species, even after the 48 hour incubation. Some oxidation of the peptide was
observed even at acidic pH, although to a much lesser extent than that observed with either
Tris or DMSO as the oxidant.

Activity of oxidized AF15846 fractions: 

Several fractions containing oxidized AF15846 resulting from treatment under the
conditions described above were collected and subjected to testing in two assays: an
125I-G-CSF competition binding assay and an ELISA format competitive G-CSF
receptor-binding assay. In both cases fractions corresponding to the predominant
Tris-oxidized species exhibited the highest activity. The activity of selected fractions in
the 125I-G-CSF competition binding assay is shown in Figure 4. While species
corresponding to the monomer peptide were inactive, matrix-assisted laser
desorption/ionization mass spectrometry (MALDI-MS) confirmed that the active,
Tris-oxidized species was a peptide dimer.
Determination of the disulfide structure of the active oxidized form of AF15846:

It was hypothesized that the active form of AF15846 would contain one intrachain
disulfide per peptide monomer and one interchain disulfide per peptide dimer. The three
possibilities for this type of structure are shown below (the disulfide bond lines of the
original drawings are not reproduced here):

H3N+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208)
H3N+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208);

H3N+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208)
H3N+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208); and

H3N+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208)
H3N+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208).

To determine if one of these structures was present in the active form of AF15846, aliquots
of Tris-oxidized AF15846 (not HPLC purified) were digested with trypsin and subjected to
reverse phase HPLC. Trypsin digestion was carried out using an immobilized enzyme
column from Perseptive Biosystems. Digestion was carried out in 25 mM Tris, pH 8, 5
mM CaCl2. Fractions were eluted from the column directly into 0.1% TFA to lower the
pH and minimize disulfide scrambling. The resulting tryptic fragments were separated by
reverse phase HPLC and analyzed by MALDI mass spectrometry and Edman sequencing.
In addition, an aliquot of the digest was analyzed by electrospray liquid
chromatography/mass spectrometry (LC/MS). MALDI MS and sequencing of the tryptic
peptides indicated the presence of peptides corresponding to disulfide bonds between
Cys-5 and Cys-5, as well as between Cys-12 and Cys-12. This finding indicated that there
were two interchain disulfide bonds between peptide monomers. This result was confirmed
by the LC/MS data (Figure 5), which identified peptides identical to those found by MALDI
MS. The tryptic peptides are labeled, beginning with the first residue, i.e., Leu, as follows:
T1 = residues 1-8; T2 = residues 9-14; T1,2 = residues 1-14; T2,3 = residues 9-15; and "+"
indicates a disulfide linkage between peptides. However, an additional minor species was
evidently present, as a peptide corresponding to a disulfide bond between Cys-5 and
Cys-12, which could be either an intrachain or an interchain disulfide, was also seen, albeit
at a lower level.
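
The tryptic fragment numbering can be reproduced with a simple in-silico digest, sketched below in Python. The function is a hypothetical helper that cleaves C-terminal to every Lys and Arg; it ignores the usual exception for a following proline (AF15846 contains no proline) and does not model missed cleavages, so combined fragments such as T1,2 or T2,3 are not generated.

```python
def digest(sequence, cleave_after="KR"):
    """Cleave C-terminal to each residue in cleave_after.

    Returns (start, end, fragment) tuples with 1-based, inclusive positions.
    """
    fragments = []
    start = 0
    for i, aa in enumerate(sequence):
        if aa in cleave_after:
            fragments.append((start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    if start < len(sequence):
        fragments.append((start + 1, len(sequence), sequence[start:]))
    return fragments


for start, end, frag in digest("LLDICELKLQECARRCN"):
    cys = [start + k for k, aa in enumerate(frag) if aa == "C"]
    print(f"residues {start}-{end}: {frag}  Cys at {cys}")
```

Complete digestion gives residues 1-8 (containing Cys-5), 9-14 (containing Cys-12), 15, and 16-17 (containing Cys-16), so a T1 + T1 or T2 + T2 disulfide-linked pair can only arise from an interchain Cys-5 to Cys-5 or Cys-12 to Cys-12 bond.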

To confirm that the active species contained at least two interchain disulfides, an
aliquot of the HPLC-purified, Tris-oxidized AF15846 shown to be active in competition
assays was also digested with trypsin. The profile of the purified material was compared
to that of the unfractionated Tris oxidation product (Figure 6, same labeling as in Figure
5). The HPLC profile indicates that the purified material is lacking a peptide
corresponding to a Cys-5 to Cys-12 disulfide-linked fragment. This indicated that the
active species contains two interchain disulfide bonds. However, the oxidation state of the
remaining Cys-16 in each monomer was not determined.

The oxidized peptide was also reacted with N-ethylmaleimide (NEM) at 37 °C for 1
hour in 100 mM ammonium acetate, pH 4.1 to see if any free Cys residues remained in the
molecule. If this were the case, treatment with the alkylating reagent would result in a
shift of the HPLC retention time. Upon incubation with NEM, no such shift was seen
(Figure 7). In contrast, when the oxidized peptide was incubated with the disulfide
specific reducing agent TCEP, also in ammonium acetate, a shift to a later retention time,
consistent with reduced peptide, was found. The reduced peptide was modified with NEM
to produce a peptide that eluted even later than the reduced form. These data indicate that
all six Cys residues in the AF15846 active dimer are involved in disulfide bonds. Since
previous results showed that Cys-5 is linked to Cys-5 and Cys-12 is linked to Cys-12, it
seems apparent that the remaining two Cys residues at position 16 of the monomer are also
involved in an interchain disulfide bond.

To obtain further information about the disulfide bond structure in active AF15846,
the peptide was digested with Lys-C in 50 mM Tris pH 7.0/30% acetonitrile. The profile of
this digest is shown in Figure 8. Four major peaks are seen. The first peak corresponds to
a dimer of residues 9-17, as indicated by the MALDI MS spectrum of this fraction. See
Figures 9A and 9B. However, it is not possible to tell with this technique if all four Cys
residues are involved in disulfide formation. The last peak contains a dimer of residues
1-8. The remaining two peaks represent intact peptide (22 min) and an artifact peak. This
second digest clearly indicates that the peptide dimerizes into a parallel structure.
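
The reasoning behind this conclusion can be illustrated with a short Python sketch that asks which Lys-C fragments remain linked for a given set of interchain disulfides. The fragment boundaries (1-8 and 9-17) follow from cleavage after Lys-8; the parallel pairing is the one proposed in the text, while the antiparallel pairing is a hypothetical alternative included only for contrast. All names are illustrative.

```python
LYS_C_FRAGMENTS = [(1, 8), (9, 17)]  # AF15846 cleaved after Lys-8


def fragment_of(position, boundaries):
    """Return the (start, end) fragment that contains a residue position."""
    for start, end in boundaries:
        if start <= position <= end:
            return (start, end)
    raise ValueError(f"position {position} is outside the peptide")


def linked_fragment_pairs(disulfides, boundaries):
    """disulfides: (position in chain A, position in chain B) pairs.

    Returns the set of (fragment of A, fragment of B) pairs that stay
    covalently linked after digestion.
    """
    return {(fragment_of(a, boundaries), fragment_of(b, boundaries))
            for a, b in disulfides}


parallel = [(5, 5), (12, 12), (16, 16)]       # pairing proposed in the text
antiparallel = [(5, 16), (12, 12), (16, 5)]   # hypothetical alternative

print(linked_fragment_pairs(parallel, LYS_C_FRAGMENTS))
print(linked_fragment_pairs(antiparallel, LYS_C_FRAGMENTS))
```

Only the parallel pairing yields separate homodimers of residues 1-8 and of residues 9-17, which matches the first and last peaks of the Lys-C digest; in the antiparallel alternative the 1-8 fragment of one chain would remain attached to the 9-17 fragment of the other.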

This three parallel interchain disulfide structure, indicated below, is different than
that originally predicted. Note that the arrows represent sites of cleavage by trypsin.

            v     vv
NH3+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208)
         |      |   |
NH3+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208)

AF15846 (dimer form)

Incubation of the oxidized peptide at 37 °C at higher pH apparently resulted in disulfide
scrambling and/or degradation of the peptide, as control peptide fractions incubated at pH
6.0 or pH 7.5 in parallel with NEM-treated fractions exhibited complex HPLC patterns
after incubation. It was necessary to drop to pH 4.1 to obtain clean profiles upon NEM
treatment.



A bioassay for determining activity of G-CSF antagonists:

A bioassay was used to measure the potency of AF15846 and other possible
G-CSF receptor antagonists. This bioassay utilizes a Ba/F3 cell line containing the rhGCSF
receptor and a c-fos promoter/luciferase gene construct (Ba/F3/rhGCSF-R/pFos-lcf).
Competent binding of a ligand to the receptor results in expression of luciferase as the
biological readout. Addition of AF15846 to the assay results in the dose-response curve
shifting to higher concentrations, indicating that the peptide is inhibiting the binding of
G-CSF to the expressed receptor (Figures 10A and 10B). Conversely, the inclusion of
various levels of peptide in the assay causes an increase in the amount of G-CSF required
to produce a signal, also indicating that the peptide inhibits G-CSF binding (Figure 11).
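
A rightward shift of this kind is what the standard Gaddum relation predicts for a simple competitive antagonist: the apparent EC50 of the agonist is multiplied by (1 + [B]/KB), where [B] is the antagonist concentration and KB its equilibrium dissociation constant. The Python sketch below illustrates that relation only. The specification does not report an EC50 or KB, and simple competitive behavior is an assumption here, so every number is hypothetical.

```python
def shifted_ec50(ec50, antagonist_conc, kb):
    """Apparent agonist EC50 in the presence of a competitive antagonist."""
    return ec50 * (1.0 + antagonist_conc / kb)


def response(agonist_conc, ec50, top=1.0, hill=1.0):
    """Fractional response from a simple Hill-type dose-response model."""
    return top * agonist_conc ** hill / (ec50 ** hill + agonist_conc ** hill)


# Hypothetical values for illustration only.
EC50_GCSF = 50e-12   # agonist EC50 without antagonist (M)
KB_PEPTIDE = 10e-9   # antagonist dissociation constant (M)

for peptide in (0.0, 10e-9, 90e-9):
    ec50_app = shifted_ec50(EC50_GCSF, peptide, KB_PEPTIDE)
    signal = response(100e-12, ec50_app)
    print(f"[peptide] = {peptide:.0e} M: apparent EC50 = {ec50_app:.1e} M, "
          f"response at 100 pM G-CSF = {signal:.2f}")
```

In this model the maximal response is unchanged and only the position of the curve moves, so more G-CSF is needed to reach a given signal as the peptide concentration rises.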
Claims

What is claimed is: 

1. A compound comprising a peptide chain approximately 10 to 40 amino acids in
length that binds to G-CSFR and contains a sequence of amino acids of formula (I)

(I) CX1X2X3X4X5X6X7X8C (SEQ ID NO: 1)

wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X1
is A, N, S, F, D, G, L, T, E, V, P, Q, H, M or K; X2 is M, G, R, H, D, I, V, A, S, E, N, F,
Y, P, C, W or T; X3 is E, V, W, F, M, A, N, S, L, T, Y, G or P; X4 is V, I, G, Q, W, M, T,
Y, L, P, D, C, E or A; X5 is M, E, W, L, P, N, I, T, V, F, Y, Q, S, R, W, G, H or D; X6 is
H, A, W, Y, V, F, Q, M, N, E, S, D, P or G; X7 is M, F, Y, V, N, L, H, D, S, W, G, Q, C or
T; and X8 is C, Y, R, I, K, W, L, E, M, H, A, T, F, D, P, G or Q.
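
For illustration only, the positional residue sets of formula (I) can be expressed as a regular expression, as in the Python sketch below. The sets are transcribed from the claim text above, where the subscripts are partly garbled in the source, so the assignment of each set to a position is an assumption; the sketch checks the sequence motif alone and says nothing about peptide length or binding to G-CSFR.

```python
import re

# Allowed residues at X1..X8 of formula (I), transcribed from the claim text.
ALLOWED = [
    "ANSFDGLTEVPQHMK",     # X1
    "MGRHDIVASENFYPCWT",   # X2
    "EVWFMANSLTYGP",       # X3
    "VIGQWMTYLPDCEA",      # X4
    "MEWLPNITVFYQSRGHD",   # X5 (W is listed twice in the text; deduplicated)
    "HAWYVFQMNESDPG",      # X6
    "MFYVNLHDSWGQCT",      # X7
    "CYRIKWLEMHATFDPGQ",   # X8
]
FORMULA_I = re.compile("C" + "".join(f"[{s}]" for s in ALLOWED) + "C")


def contains_formula_i(peptide):
    """True if the peptide contains a stretch matching formula (I)."""
    return FORMULA_I.search(peptide) is not None


for seq in ("CAGEVMHMCC", "GCNREIEAMCCG", "CKKKKKKKKC"):
    print(seq, contains_formula_i(seq))
```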

2. The compound of claim 1, wherein X1 is D or P, X2 is D or P, X3 is E or W, X4
is V, I or Y, X5 is M or L, X6 is W, Y or F, X7 is M, Y or D, and X8 is C or M.

3. The compound of claim 1, wherein the sequence of amino acids is selected from
the group consisting of:

CAGEVMHMCC (SEQ ID NO: 8);
CNREIEAMCC (SEQ ID NO: 9);
CADEVMHFCC (SEQ ID NO: 10);
CNREIMWMCC (SEQ ID NO: 11);
CSHEVWWYCC (SEQ ID NO: 12);
CSREVLYYCC (SEQ ID NO: 13);
CFIEGPWVCC (SEQ ID NO: 14);
CFVEGNWYCC (SEQ ID NO: 15);
CAAEVMVNCC (SEQ ID NO: 16);
CSDEVIFYCC (SEQ ID NO: 17);
CDREIMWFCC (SEQ ID NO: 18);
CAHEVMWMCC (SEQ ID NO: 19);
CGSEVTFMCC (SEQ ID NO: 20);
CLEEIMWLCC (SEQ ID NO: 21);
CAREVLAMCC (SEQ ID NO: 22);
CSVEVMQMCC (SEQ ID NO: 23);
CTNVQLMHYC (SEQ ID NO: 24);
CDVWQLFDRC (SEQ ID NO: 25);
CSFVQLNSIC (SEQ ID NO: 26);
CDYWQWFDKC (SEQ ID NO: 27);
CESFWVELWC (SEQ ID NO: 28);
CVPWMFYDLC (SEQ ID NO: 29);
CDPWMFYDLC (SEQ ID NO: 30);
CDPWVLFDEC (SEQ ID NO: 31);
CDHWTYFDMC (SEQ ID NO: 32);
CWWTLYDKC (SEQ ID NO: 33);
CPDWYQSYMC (SEQ ID NO: 34);
CPDWYSYYMC (SEQ ID NO: 35);
CPEWYTDVMC (SEQ ID NO: 36);
CPDWYLDYMC (SEQ ID NO: 37);
CPEWYLDYMC (SEQ ID NO: 38);
CPDWYLPYMC (SEQ ID NO: 39);
CPEWYLPYMC (SEQ ID NO: 40);
CQDWWVELWC (SEQ ID NO: 41);
CPDWYLPWMC (SEQ ID NO: 42);
CACMLRWHC (SEQ ID NO: 43);
CQRAGYMLAC (SEQ ID NO: 44);
CHANPVWGEC (SEQ ID NO: 45);
CFWSDWGQTC (SEQ ID NO: 46);
CPHWTSYYMC (SEQ ID NO: 47);
CETLCGACFC (SEQ ID NO: 48);
CATTINDTLC (SEQ ID NO: 49);
CLNYPHPVFC (SEQ ID NO: 50);
CMDGEMAVDC (SEQ ID NO: 51);
CNMGWMSWPC (SEQ ID NO: 52);
CETYADWLGC (SEQ ID NO: 53);
CDPWMFFDMC (SEQ ID NO: 54);
CDPWIWYDLC (SEQ ID NO: 55);
CDPWIMYDRC (SEQ ID NO: 56);
CDPWVFFDIC (SEQ ID NO: 57);
CDPWTYYDLC (SEQ ID NO: 58);
CDPWIFYDRC (SEQ ID NO: 59);
CDPWLFYDLC (SEQ ID NO: 60);
CDPWVWYDLC (SEQ ID NO: 61);
CDPWIFFDRC (SEQ ID NO: 62);
CDPWMFFDQC (SEQ ID NO: 63);
CDPWLWYDRC (SEQ ID NO: 64);
CDVWVWYDQC (SEQ ID NO: 65);
CDPWIYYDLC (SEQ ID NO: 66);
CVPWTLFDLC (SEQ ID NO: 67);
CPAWYLEYMC (SEQ ID NO: 68);
CPDWYLEYMC (SEQ ID NO: 69);
CKYWQWFDKC (SEQ ID NO: 70); and
CDHWMWYDKC (SEQ ID NO: 71).



4. The compound of claim 1, wherein the sequence of amino acids is selected from
the group consisting of:

GCNREIEAMCCG (SEQ ID NO: 72);
GCPEWYTDVMCG (SEQ ID NO: 73);
NWYCMDGEMAVDCEAT (SEQ ID NO: 74);
WQSCNMGWMSWPCYFV (SEQ ID NO: 75);
HELCETYADWLGCVEW (SEQ ID NO: 76);
PCDPWMFFDMCERW (SEQ ID NO: 77);
LRGCDPWTWYDLCPAV (SEQ ID NO: 78);
GYLCDPWDCYDRCLGF (SEQ ID NO: 79);
RFACDPWVFFDICGYW (SEQ ID NO: 80);
GYWCDPWTYYDLCLTA (SEQ ID NO: 81);
MWTCDPWIFYDRCFLN (SEQ ID NO: 82);
GSSCDPWLFYDLCLLD (SEQ ID NO: 83);
GGGCDPWVWYDLCWCD (SEQ ID NO: 84);
YTSCDPWIFFDRCMSV (SEQ ID NO: 85);
DPYCDPWMFFDQCAYL (SEQ ID NO: 86);
REFCDPWLWYDRCL (SEQ ID NO: 87);
NTGCDVWVWYDQCFAM (SEQ ID NO: 88);
LVFCDPWIYYDLCMDT (SEQ ID NO: 89);
GCSFVQLNSICG (SEQ ID NO: 90);
GCPAWYLEYMCG (SEQ ID NO: 91);
GCPDWYLEYMCG (SEQ ID NO: 92);
GCKYWQWFDKCG (SEQ ID NO: 93); and
GCDHWMWYDKCG (SEQ ID NO: 94).



5. The compound of claim 1, comprising a dimer having the structure of formula
(VIII)

                (βA)n4 - R2 - (βA)n2
               /                    \
(VIII)   (Lk)x                      (Lk)y
               \                    /
                (βA)n3 - R1 - (βA)n1

wherein R1 and R2 are independently selected from the sequences of amino acids of
formula (I); βA is a β-alanine residue; n1, n2, n3, n4, x and y are independently zero or
one with the proviso that the sum of x and y is either one or two; and Lk is a terminal
linking moiety selected from the group consisting of a disulfide bond, a carbonyl moiety, a
C1-12 linking moiety optionally terminated with one or two -NH- linkages and optionally
substituted at one or more available carbon atoms with a lower alkyl substituent, a lysine
residue or a lysine amide.



6. A compound comprising a peptide chain approximately 9 to 40 amino acids in
length that binds to G-CSFR and contains a sequence of amino acids of formula (II)

(II) X'1X'2X'3SGWVWX'4 (SEQ ID NO: 2)

wherein each amino acid is indicated by the standard one-letter abbreviation, and wherein
X'1 is S, Q, R, L or Y; X'2 is N, S, T, A or D; X'3 is E, D or N; and X'4 is L, V, T, P or H.

7. The compound of claim 6, wherein the sequence of amino acids is selected from
the group consisting of:

SNESGWVWL (SEQ ID NO: 95);
QSNSGWVWV (SEQ ID NO: 96);
RTESGWVWT (SEQ ID NO: 97);
RANSGWVWV (SEQ ID NO: 98);
YDNSGWVWH (SEQ ID NO: 99); and
LSDSGWVWVP (SEQ ID NO: 100).

8. The compound of claim 6, comprising a dimer having the structure of formula
(VIII)

wherein R1 and R2 are independently selected from the sequences of amino acids of
formula (II); βA is a β-alanine residue; n1, n2, n3, n4, x and y are independently zero or
one with the proviso that the sum of x and y is either one or two; and Lk is a terminal
linking moiety selected from the group consisting of a disulfide bond, a carbonyl moiety, a
C1-12 linking moiety optionally terminated with one or two -NH- linkages and optionally
substituted at one or more available carbon atoms with a lower alkyl substituent, a lysine
residue or a lysine amide.

9. The compound of claim 8, wherein the dimer is:
NH2-EQSNSGWVWVGGGGC-CONH2 (SEQ ID NO: 101)
NH2-EQSNSGWVWVGGGGC-CONH2 (SEQ ID NO: 101).

10. A compound comprising a peptide chain approximately 6 to 40 amino acids in
length that binds to G-CSFR and contains a sequence of amino acids of formula (III)

(III) E R X^II_1 X^II_2 X^II_3 C (SEQ ID NO: 3)

wherein each amino acid is indicated by standard one-letter abbreviation, and wherein
X^II_1 is D, L, S, G, E, A, K or Y; X^II_2 is W, Y, F, L or V; and X^II_3 is F, G, M or L.

11. The compound of claim 10, wherein X^II_1 is D or L, X^II_2 is W, and X^II_3 is F.

12. The compound of claim 10, wherein the sequence of amino acids is selected
from the group consisting of:

ERDWFC (SEQ ID NO: 120);
ERDWGC (SEQ ID NO: 121);
ERLWFC (SEQ ID NO: 122);
ERSYFC (SEQ ID NO: 123);
ERGWFC (SEQ ID NO: 124);
EREWFC (SEQ ID NO: 125);
ERAWFC (SEQ ID NO: 126);
ERLYFC (SEQ ID NO: 127);
ERYFMC (SEQ ID NO: 128);
ERLFLC (SEQ ID NO: 129);
ERALMC (SEQ ID NO: 130);
ERDVMC (SEQ ID NO: 131); and
ERKWFC (SEQ ID NO: 132).

13. The compound of claim 10, comprising a dimer having the structure of 
formula (VIII) 

y (pA) n4 R 2 (pA) n2 

25 (VIII) / "V 

(Lk) x ^>Lk)y 

N (PA) n3 R 1 (pA) n1 

wherein R1 and R2 are independently selected from the sequences of amino acids of
formula (III); βA is a β-alanine residue; n1, n2, n3, n4, x and y are independently zero or
one with the proviso that the sum of x and y is either one or two; and Lk is a terminal
linking moiety selected from the group consisting of a disulfide bond, a carbonyl moiety, a
C1-12 linking moiety optionally terminated with one or two -NH- linkages and optionally
substituted at one or more available carbon atoms with a lower alkyl substituent, a lysine
residue or a lysine amide.
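
The proviso on the indices of formula (VIII) can be made concrete by enumeration. The following Python sketch is illustrative only and forms no part of the claims; the function name `valid_dimer_parameters` is hypothetical. It lists every assignment of n1, n2, n3, n4, x and y to zero or one for which the sum of x and y is one or two, that is, every assignment in which at least one linking moiety Lk is present.

```python
from itertools import product


def valid_dimer_parameters():
    """Enumerate (n1, n2, n3, n4, x, y) tuples allowed by the proviso."""
    return [
        (n1, n2, n3, n4, x, y)
        for n1, n2, n3, n4, x, y in product((0, 1), repeat=6)
        if x + y in (1, 2)  # excludes only x == y == 0
    ]


combos = valid_dimer_parameters()
print(len(combos))  # 48: 16 choices for n1..n4 times 3 choices for (x, y)
```

Since each index is zero or one, the proviso removes only the case x = y = 0, leaving 48 of the 64 possible assignments.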

14. The compound of claim 13, selected from the group consisting of: 
NH2-STAERLWFCG-CONH2 (SEQ ID NO: 135)
NH2-STAERLWFCG-CONH2 (SEQ ID NO: 135);

NH2-QSNSEREWFC-CONH2 (SEQ ID NO: 138)
NH2-QSNSEREWFC-CONH2 (SEQ ID NO: 138); and

NH2-QSNSEREWFCG-CONH2 (SEQ ID NO: 149)
NH2-QSNSEREWFCG-CONH2 (SEQ ID NO: 149).

15. A compound comprising a peptide chain approximately 9 to 40 amino acids in 
length that binds to G-CSFR and contains a sequence of amino acids of formula (IV)

(IV) X^III_1 M V Y X^III_2 X^III_3 P X^III_4 W (SEQ ID NO: 4)
wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^III_1
is D or E; X^III_2 is A or T; X^III_3 is Y or V; and X^III_4 is P or Y.

16. The compound of claim 15, wherein the sequence of amino acids is selected 
from the group consisting of: 

DMVYAYPPW (SEQ ID NO: 153); and
EMVYTVPYW (SEQ ID NO: 154).

17. The compound of claim 15, comprising a dimer having the structure of
formula (VIII)

     /(βA)n4 - R2 - (βA)n2\
(Lk)x                       (Lk)y
     \(βA)n3 - R1 - (βA)n1/

wherein R1 and R2 are independently selected from the sequences of amino acids of
formula (IV); βA is a β-alanine residue; n1, n2, n3, n4, x and y are independently zero or
one with the proviso that the sum of x and y is either one or two; and Lk is a terminal
linking moiety selected from the group consisting of a disulfide bond, a carbonyl moiety, a
C1-12 linking moiety optionally terminated with one or two -NH- linkages and optionally
substituted at one or more available carbon atoms with a lower alkyl substituent, a lysine
residue or a lysine amide.

18. A compound comprising a peptide chain approximately 12 to 40 amino acids
in length that binds to G-CSFR and contains a sequence of amino acids of formula (V)
(V) C X^IV_1 X^IV_2 X^IV_3 X^IV_4 X^IV_5 X^IV_6 X^IV_7 X^IV_8 X^IV_9 X^IV_10 C (SEQ ID NO: 5)
wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^IV_1
is E, G, P, N, R, T, W, S, L, H, A, Q or Y; X^IV_2 is S, T, E, A, D, G, W, P, L, N, V, Y, R or
M; X^IV_3 is R, Y, V, Q, E, T, L, P, S, K, M, A or W; X^IV_4 is L, M, G, F, W, R, S, V, P, A, D,
C or T; X^IV_5 is V, T, A, R, S, L, W, C, I, E, P, H, F, D or Q; X^IV_6 is E, Y, G, T, Q, M, S, N,
A or P; X^IV_7 is C, V, D, G, L, W, E, V, I, S, M or A; X^IV_8 is S, Y, A, W, P, V, L, Q, G, K,
F, I, E or D; X^IV_9 is R, W, M, D, H, V, G, A, Q, L, S, E or Y; X^IV_10 is M, L, I, S, V, P, W,
F, T, Y, R, or Q.

19. The compound of claim 18, wherein X^IV_1 is E, X^IV_2 is S or A, X^IV_3 is R, X^IV_4 is
L, X^IV_5 is V or S, X^IV_6 is E, X^IV_7 is C, X^IV_8 is S, X^IV_9 is R, and X^IV_10 is L.

20. The compound of claim 18, wherein the sequence of amino acids is selected 
from the group consisting of: 
CESRLVECSRMC (SEQ ID NO: 157);
CETYMTYVYWLC (SEQ ID NO: 158);
CGERLAECARLC (SEQ ID NO: 159);
CESRLRECSMLC (SEQ ID NO: 160);
CEARLSECSRIC (SEQ ID NO: 161);
CPARLLECSRMC (SEQ ID NO: 162);
CESVGVGDWWSC (SEQ ID NO: 163);
CEDRLVEGPWVC (SEQ ID NO: 164);
CNDQFRTCVDVC (SEQ ID NO: 165);
CRGEWWELYHPC (SEQ ID NO: 166);
CEDTRTGWAWSC (SEQ ID NO: 167);
CTWLSSGELVWC (SEQ ID NO: 168);
CWPPVCEVSGIC (SEQ ID NO: 169);
CSLSPIQLQHLC (SEQ ID NO: 170);
CLARLEECSRFC (SEQ ID NO: 171);
CHNSSPMVGVTC (SEQ ID NO: 172);
CHVSPVQIKALC (SEQ ID NO: 173);
CAAPATSWFQYC (SEQ ID NO: 174);
CASKLHECSLRC (SEQ ID NO: 175);
CEPMDSNGIVQC (SEQ ID NO: 176);
CQYASAADEQRC (SEQ ID NO: 177);
CEYWDEPSLSWC (SEQ ID NO: 178);
CERECFQMLERC (SEQ ID NO: 179);
CGMSTDELDEIC (SEQ ID NO: 180);
CYVSPSTGLYSC (SEQ ID NO: 181);
CEARLVECSRLC (SEQ ID NO: 182);
CESRLSECSRMC (SEQ ID NO: 183);
CELKLQECARRC (SEQ ID NO: 184);
CELKLQEAARRC (SEQ ID NO: 185); and
CLERLEECSRFC (SEQ ID NO: 186).



21. The compound of claim 18, comprising a dimer having the structure of
formula (VIII)

            /(βA)n4 - R2 - (βA)n2\
(VIII) (Lk)x                       (Lk)y
            \(βA)n3 - R1 - (βA)n1/

wherein R1 and R2 are independently selected from the sequences of amino acids of
formula (V); βA is a β-alanine residue; n1, n2, n3, n4, x and y are independently zero or
one with the proviso that the sum of x and y is either one or two; and Lk is a terminal
linking moiety selected from the group consisting of a disulfide bond, a carbonyl moiety, a
C1-12 linking moiety optionally terminated with one or two -NH- linkages and optionally
substituted at one or more available carbon atoms with a lower alkyl substituent, a lysine
residue or a lysine amide.

22. The compound of claim 21, having the structure:
NH3+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208)
NH3+-LLDICELKLQECARRCN-COO- (SEQ ID NO: 208).



23. A compound comprising a peptide chain approximately 9 to 40 amino acids in 
length that binds to G-CSFR and contains a sequence of amino acids of formula (VI) 

(VI) X^V_1 X^V_2 X^V_3 X^V_4 X^V_5 X^V_6 C X^V_7 X^V_8 (SEQ ID NO: 6)
wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^V_1
is E, C, Q, V, or Y; X^V_2 is E, A, L, M, S, W, or Q; X^V_3 is K, R or T; X^V_4 is L, A, or V; X^V_5
is R, A, M, H, E, V, L, G, D, Q, or S; X^V_6 is E or V; X^V_7 is A or G; X^V_8 is R, H, G or L.

24. The compound of claim 23, wherein X^V_1 is E, X^V_2 is A or L, X^V_3 is K or R, X^V_4
is L, X^V_6 is E, X^V_7 is A, and X^V_8 is R.

25. The compound of claim 23, wherein the sequence of amino acids is selected 
from the group consisting of: 

EEKLRECAR (SEQ ID NO: 214);
EARLAECAR (SEQ ID NO: 215);
CMKLMECAR (SEQ ID NO: 216);
ELRLRECAH (SEQ ID NO: 217);
EAKLHECAR (SEQ ID NO: 218);
ELKLAECAR (SEQ ID NO: 219);
EARLEECAR (SEQ ID NO: 220);
EAKLRECAR (SEQ ID NO: 221);
ELRLAECAR (SEQ ID NO: 222);
ESRLAECAR (SEQ ID NO: 223);






EAKLVECAR (SEQ ID NO: 224);
ESRLRECAR (SEQ ID NO: 225);
EAKLAECAR (SEQ ID NO: 226);
QWRLEECAR (SEQ ID NO: 227);
QLRLEECAR (SEQ ID NO: 228);
ELRLEECAR (SEQ ID NO: 229);
EAKLLECAR (SEQ ID NO: 230);
EARAGVCAG (SEQ ID NO: 231);
EAKAGVCAG (SEQ ID NO: 232);
VARLEECAR (SEQ ID NO: 233);
ELKLDECAR (SEQ ID NO: 234);
EWRLQECAR (SEQ ID NO: 235);
EAKLSECAR (SEQ ID NO: 236);
EARLSECAR (SEQ ID NO: 237);
ELKLLECAR (SEQ ID NO: 238);
ELRLQECGR (SEQ ID NO: 239);
EQKLAECAR (SEQ ID NO: 240);
ELRLQECAR (SEQ ID NO: 241);
ELKLEECAR (SEQ ID NO: 242);
ESRLEECAR (SEQ ID NO: 243);
EATVQECAR (SEQ ID NO: 244);
ELKLQECAR (SEQ ID NO: 245);
YSRLEECGR (SEQ ID NO: 246);
ELRLRECAL (SEQ ID NO: 247);
EARLLECAR (SEQ ID NO: 248);
ESRLLECAR (SEQ ID NO: 249);
VLKLEECAR (SEQ ID NO: 250);
ESKLAECAR (SEQ ID NO: 251);
ESKLRECAR (SEQ ID NO: 252);
EYKLGECAR (SEQ ID NO: 253);
ESRLQECAR (SEQ ID NO: 254);
QARLAECAR (SEQ ID NO: 255);
ELKKQECAR (SEQ ID NO: 256);
ESRLSECAR (SEQ ID NO: 257);
EARLEECGR (SEQ ID NO: 258);
ESRLAECGR (SEQ ID NO: 259);
EWRLEECAR (SEQ ID NO: 260);
EARLSECGR (SEQ ID NO: 261);
AARLAECAR (SEQ ID NO: 262);
EWKLAECAR (SEQ ID NO: 263);
ESKLEECAR (SEQ ID NO: 264);
DVKLAECAR (SEQ ID NO: 265);
ELQLEECAR (SEQ ID NO: 266); and
EYKLASCAR (SEQ ID NO: 267).



26. The compound of claim 23, comprising a dimer having the structure of
formula (VIII)

     /(βA)n4 - R2 - (βA)n2\
(Lk)x                       (Lk)y
     \(βA)n3 - R1 - (βA)n1/

wherein R1 and R2 are independently selected from the sequences of amino acids of
formula (V); βA is a β-alanine residue; n1, n2, n3, n4, x and y are independently zero or
one with the proviso that the sum of x and y is either one or two; and Lk is a terminal
linking moiety selected from the group consisting of a disulfide bond, a carbonyl moiety, a
C1-12 linking moiety optionally terminated with one or two -NH- linkages and optionally
substituted at one or more available carbon atoms with a lower alkyl substituent, a lysine
residue or a lysine amide.



27. The compound of claim 26, selected from the group consisting of: 



[H]-DLWYLESKLEECARRANG-[NH2] (SEQ ID NO: 339)
[H]-DLWYLESKLEECARRANG-[NH2] (SEQ ID NO: 339);

[H]-DLWYLESKLEECARRCNG-[NH2] (SEQ ID NO: 340); and

[H]-LLDICELKLQECARRAN-[OH] (SEQ ID NO: 343).



28. A compound comprising a peptide chain approximately 10 to 40 amino acids 
in length that binds to G-CSFR and contains a sequence of amino acids of formula (VII) 

(VII) X^VI_1 X^VI_2 X^VI_3 X^VI_4 X^VI_5 E X^VI_6 X^VI_7 X^VI_8 X^VI_9 (SEQ ID NO: 7)
wherein each amino acid is indicated by standard one-letter abbreviation, and wherein X^VI_1
is A, E or G; X^VI_2 is E, H or D; X^VI_3 is R or G; X^VI_4 is K, Y, M, N, Q, R, D, I, S or E; X^VI_5
is A, S or P; X^VI_6 is E, D, T, Q, K or A; X^VI_7 is R, W, K, L, S, A or Q; X^VI_8 is R or E; and
X^VI_9 is W, G, or R.
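
For orientation only, the nine residue lists recited for formula (VII) can be tabulated position by position, which allows both a membership check and a count of the distinct ten-residue cores the formula admits. The Python sketch below is not part of the claims. It assumes that the single fixed residue E sits between the fifth and sixth variable positions, an ordering inferred from the sequences listed for this formula (for example AERKAEERRW); the names `POSITIONS`, `matches_formula_vii` and `core_count` are hypothetical.

```python
from math import prod

# Allowed residues at each of the ten positions of the formula (VII) core.
# The lone "E" entry is the fixed residue; its placement is an assumption
# inferred from the listed sequences, not taken from a legible formula.
POSITIONS = [
    "AEG", "EHD", "RG", "KYMNQRDISE", "ASP",
    "E",
    "EDTQKA", "RWKLSAQ", "RE", "WGR",
]


def matches_formula_vii(core: str) -> bool:
    """Return True if a ten-residue core satisfies every position list."""
    return len(core) == len(POSITIONS) and all(
        residue in allowed for residue, allowed in zip(core, POSITIONS)
    )


def core_count() -> int:
    """Number of distinct cores obtained by choosing one residue per position."""
    return prod(len(allowed) for allowed in POSITIONS)
```

Under these assumptions `core_count()` returns 3 x 3 x 2 x 10 x 3 x 1 x 6 x 7 x 2 x 3 = 136080, and `matches_formula_vii("AERKAEERRW")` is true.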

29. The compound of claim 28, wherein the sequence of amino acids is selected
from the group consisting of: 

AERKAEERRW (SEQ ID NO: 344); 
AERYAEEREG (SEQ ID NO: 345); 
AERMAEERRW (SEQ ID NO: 346); 
AERKAEERRR (SEQ ID NO: 347); 
AHRNAEERRW (SEQ ID NO: 348); 
AERKSEDWRW (SEQ ID NO: 349); 
AERKAEEKRR (SEQ ID NO: 350); 
AERQAETRRW (SEQ ID NO: 351); 
AERNAEERRW (SEQ ID NO: 352); 
AERQAEERRW (SEQ ID NO: 353); 
AERRAEERRW (SEQ ID NO: 354); 
AERDAEQRRW (SEQ ID NO: 355); 
AERIAEERRW (SEQ ID NO: 356); 
AERSAEERRW (SEQ ID NO: 357); 
AERKAEELRW (SEQ ID NO: 358); 
AERKAEESRW (SEQ ID NO: 359);
EERKAEERRW (SEQ ID NO: 360);
ADGKAEERRW (SEQ ID NO: 361); 
ADGKAEELRW (SEQ ID NO: 362); 
ADGMPEERRW (SEQ ID NO: 363); 
ADGEAEKRRW (SEQ ID NO: 364); 
ADGNAEERRW (SEQ ID NO: 365); 
ADGEAEKARW (SEQ ID NO: 366); 
AEGEAEKARW (SEQ ID NO: 367); 
GERKAEERRW (SEQ ID NO: 368); 
AEREAEERRW (SEQ ID NO: 369); 
ADGEAEARRW (SEQ ID NO: 370); 
ADGRAEEARW (SEQ ID NO: 371); 
AEGRAEEARW (SEQ ID NO: 372); 
AEREAEKARW (SEQ ID NO: 373); 
AERKAEEQRW (SEQ ID NO: 374); 
AERDAEKRRW (SEQ ID NO: 375); and 
AEREAEKLRW (SEQ ID NO: 376). 



30. The compound of claim 28, comprising a dimer having the structure of 
formula (VIII) 

            /(βA)n4 - R2 - (βA)n2\
(VIII) (Lk)x                       (Lk)y
            \(βA)n3 - R1 - (βA)n1/

wherein R1 and R2 are independently selected from the sequences of amino acids of
formula (VI); βA is a β-alanine residue; n1, n2, n3, n4, x and y are independently zero or
one with the proviso that the sum of x and y is either one or two; and Lk is a terminal
linking moiety selected from the group consisting of a disulfide bond, a carbonyl moiety, a
C1-12 linking moiety optionally terminated with one or two -NH- linkages and optionally
substituted at one or more available carbon atoms with a lower alkyl substituent, a lysine
residue or a lysine amide.

31. The compound of claim 30, wherein the dimer is selected from the group
consisting of:

MLAERKAEERRWFNTHGRE (SEQ ID NO: 377) 
MLAERKAEERRWFNTHGRE-K(NH2) (SEQ ID NO: 378); and

CMLAERKAEERRWFNTHGRE (SEQ ID NO: 380) 

CMLAERKAEERRWFNTHGRE-K (SEQ ID NO: 381). 

32. The compound of any one of claims 1, 6, 10, 15, 18, 23 or 28, containing a 
disulfide bond. 

33. The compound of any one of claims 1, 6, 10, 15, 18, 23 or 28, wherein the N- 
terminus of the peptide is coupled to a polyethylene glycol molecule. 

34. The compound of any one of claims 1, 6, 10, 15, 18, 23 or 28, wherein the N- 
terminus of the peptide is acetylated. 

35. The compound of any one of claims 1, 6, 10, 15, 18, 23 or 28, wherein the
C-terminus of the peptide is amidated.

36. A pharmaceutical composition comprising a therapeutically effective amount 
of the compound of any one of claims 1, 6, 10, 15, 18, 23 or 28, in combination with a 
pharmaceutically acceptable carrier. 

37. A method for treating a patient who would benefit from administration of a G- 
CSF modulator, comprising administering to the patient a therapeutically effective amount 
of the compound of any one of claims 1, 6, 10, 15, 18, 23 or 28. 

38. The method of claim 37, wherein the G-CSF modulator is an agonist for the
G-CSFR.

39. The method of claim 38, wherein the patient suffers from a depressed
neutrophil count.

40. The method of claim 39, wherein the depressed neutrophil count is caused by a
condition selected from the group consisting of chemotherapy-induced neutropenia,
AIDS-induced neutropenia and community-acquired pneumonia-induced neutropenia.

41. A compound comprising a peptide chain approximately 6 to 40 amino acids in 
length that binds to G-CSF and contains a sequence of amino acids selected from the
group consisting of:

CTWTDLESVY (SEQ ID NO: 433);
HTTNEQFFMC (SEQ ID NO: 434);
DTWLELESRY (SEQ ID NO: 435);
HNSSPMVGVT (SEQ ID NO: 436);
DWQKTIPAYW (SEQ ID NO: 437);
RWGREGLVAALL (SEQ ID NO: 438);
WSGTRVWRCWT (SEQ ID NO: 439);
MSLLSYLRS (SEQ ID NO: 440);
LDLLAI (SEQ ID NO: 441);
RIYGVK (SEQ ID NO: 442);
MTWHMFMSLLF (SEQ ID NO: 443);
FFWASWMHLLW (SEQ ID NO: 444);
FDDCWREREQFLFQAL (SEQ ID NO: 445);
CGRASECFRLLEM (SEQ ID NO: 446);
RECFQMLER (SEQ ID NO: 447);
CSIRWDFVPGYGLC (SEQ ID NO: 448);
WMQCWDSLSLCYDM (SEQ ID NO: 449);
ALLMCESKLAECARAR (SEQ ID NO: 450);
LAHCKKRKEECAAG (SEQ ID NO: 451);
SIDGVYLRTSRT (SEQ ID NO: 452);
SIDGVYLRTRSRTRY (SEQ ID NO: 453);
VWRLRGSTLRGLRD (SEQ ID NO: 454);
DRGGGTVGVYWWESY (SEQ ID NO: 455);
VWGTVGTWLEY (SEQ ID NO: 456); 
LMWVSAY (SEQ ID NO: 457); 
RASDEYGALVRFCTNL (SEQ ID NO: 458); 
NYWCDSNWVCEIA (SEQ ID NO: 459); 
LAHCLLRLEECAAG (SEQ ID NO: 460); 
LALCLARLRECAGG (SEQ ID NO: 461); 
CESRLVECSRM (SEQ ID NO: 462); 
LLDIAELKLQECARRCN (SEQ ID NO: 463); 
KLLDIAELKLQECCARRCN (SEQ ID NO: 464); 
CSTGGGLTAERDAEKRRWLLTHGGE (SEQ ID NO: 465);
LTAERDAEKRRWLLTHGGEGG (SEQ ID NO: 466);
LTAERDAEKRRWLLTHGGEGGK (SEQ ID NO: 467);
LTAERDAEKRRWLLTHGGEGGGGG (SEQ ID NO: 468);
LTAERDAEKRRWLLTHGGEGGGGGK (SEQ ID NO: 469);
ESGWVW (SEQ ID NO: 470);
NSGWVW (SEQ ID NO: 471);
SGWVW (SEQ ID NO: 472);
PLGKCEATCREMARYFN (SEQ ID NO: 473);
SLQRCEYKLASVRGLCN (SEQ ID NO: 474);
DLWYLESKLEEAARRCNG (SEQ ID NO: 475);
PYMGTRSRAKLLRQQ (SEQ ID NO: 476);
RNAGERRWFKTQGWY (SEQ ID NO: 477);
MLAERNADDRRWFNTHGRD (SEQ ID NO: 478);
MMADGRLRNSVGLILWCD (SEQ ID NO: 479); 
MLADGRLRNWG (SEQ ID NO: 480); 
LLADVRRRNGVGLLRMGRD (SEQ ID NO: 481); 
MLADGRLRNFGG (SEQ ID NO: 482); 
TYMTYVYWLC (SEQ ID NO: 483); (CORE 158) 
RFGERWGL (SEQ ID NO: 484); 
HWLWWGWNF (SEQ ID NO: 485); 
RECFQMLERC (SEQ ID NO: 486);
ILAHRNAKERRWFQKHGR (SEQ ID NO: 487); and
CSTGGGLTAERDAEKRRWLLTHGGEK (SEQ ID NO: 489).

Figure 1-1



CAGEVMHMCC (SEQ ID NO: 8) 
CNREIEAMCC (SEQ ID NO: 9) 
CADEVMHFCC (SEQ ID NO: 10)
CNREIMWMCC (SEQ ID NO: 11)
CSHEVWWYCC (SEQ ID NO: 12) 
CSREVLYYCC (SEQ ID NO: 13) 
CFIEGPWVCC (SEQ ID NO: 14) 
CFVEGNWYCC (SEQ ID NO: 15) 
CAAEVMVNCC (SEQ ID NO: 16) 
CSDEVIFYCC (SEQ ID NO: 17) 
CDREIMWFCC (SEQ ID NO: 18) 
CAHEVMWMCC (SEQ ID NO: 19) 
CGSEVTFMCC (SEQ ID NO: 20) 
CLEEIMWLCC (SEQ ID NO: 21) 
CAREVLAMCC (SEQ ID NO: 22) 
CSVEVMQMCC (SEQ ID NO: 23) 
CTNVQLMHYC (SEQ ID NO: 24) 
CDVWQLFDRC (SEQ ID NO: 25) 
CSFVQLNSIC (SEQ ID NO: 26) 
CDYWQWFDKC (SEQ ID NO: 27) 
CESFWVELWC (SEQ ID NO: 28) 
CVPWMFYDLC (SEQ ID NO: 29) 
CDPWMFYDLC (SEQ ID NO: 30) 
CDPWVLFDEC (SEQ ID NO: 31) 
CDHWTYFDMC (SEQ ID NO: 32) 
CWWTLYDKC (SEQ ID NO: 33) 
CPDWYQSYMC (SEQ ID NO: 34) 
CPDWYSYYMC (SEQ ID NO: 35) 
CPEWYTDVMC (SEQ ID NO: 36) 
CPDWYLDYMC (SEQ ID NO: 37) 
CPEWYLDYMC (SEQ ID NO: 38) 
CPDWYLPYMC (SEQ ID NO: 39)
CPEWYLPYMC (SEQ ID NO: 40) 
CQDWWVELWC (SEQ ID NO: 41) 
CPDWYLPWMC (SEQ ID NO: 42) 
CACMLRWHC (SEQ ID NO: 43) 
CQRAGYMLAC (SEQ ID NO: 44) 
CHANPVWGEC (SEQ ID NO: 45)
CFWSDWGQTC (SEQ ID NO: 46) 
CPHWTSYYMC (SEQ ID NO: 47)
CETLCGACFC (SEQ ID NO: 48) 
CATONDTLC (SEQ ID NO: 49) 
CLNYPHPVFC (SEQ ID NO: 50)

Figure 1-2



CMDGEMAVDC (SEQ ID NO: 51) 
CNMGWMSWPC (SEQ ID NO: 52) 
CETYADWLGC (SEQ ID NO: 53) 
CDPWMFFDMC (SEQ ID NO: 54) 
CDPWIWYDLC (SEQ ID NO: 55) 
CDPWIMYDRC (SEQ ID NO: 56) 
CDPWVFFDIC (SEQ ID NO: 57) 
CDPWTYYDLC (SEQ ID NO: 58) 
CDPWIFYDRC (SEQ ID NO: 59) 
CDPWLFYDLC (SEQ ID NO: 60) 
CDPWVWYDLC (SEQ ID NO: 61) 
CDPWIFFDRC (SEQ ID NO: 62) 
CDPWMFFDQC (SEQ ID NO: 63) 
CDPWLWYDRC (SEQ ID NO: 64)
CDVWVWYDQC (SEQ ID NO: 65) 
CDPWIYYDLC (SEQ ID NO: 66) 
CVPWTLFDLC (SEQ ID NO: 67) 
CPAWYLEYMC (SEQ ID NO: 68) 
CPDWYLEYMC (SEQ ID NO: 69) 
CKYWQWFDKC (SEQ ID NO: 70) 
CDHWMWYDKC (SEQ ID NO: 71) 
GCNREIEAMCCG (SEQ ID NO: 72) 
GCPEWYTDVMCG (SEQ ID NO: 73) 
NWYCMDGEMAVDCEAT (SEQ ID NO: 74) 
WQSCNMGWMSWPCYFV (SEQ ID NO: 75) 
HELCETYADWLGCVEW (SEQ ID NO: 76) 
PCDPWMFFDMCERW (SEQ ID NO: 77) 
LRGCDPWIWYDLCPAV (SEQ ID NO: 78) 
GYLCDPWIFYDRCLGF (SEQ ID NO: 79) 
RFACDPWVFFDICGYW (SEQ ID NO: 80) 
GYWCDPWTYYDLCLTA (SEQ ID NO: 81) 
MWTCDPWIFYDRCFLN (SEQ ID NO: 82) 
GSSCDPWLFYDLCLLD (SEQ ID NO: 83) 
GGGCDPWVWYDLCWCD (SEQ ID NO: 84) 
YTSCDPWIFFDRCMSV (SEQ ID NO: 85) 
DPYCDPWMFFDQCAYL (SEQ ID NO: 86) 
REFCDPWLWYDRCL (SEQ ID NO: 87) 
NTGCDVWVWYDQCFAM (SEQ ID NO: 88) 
LVFCDPWIYYDLCMDT (SEQ ID NO: 89) 
GCSFVQLNSICG (SEQ ID NO: 90) 
GCPAWYLEYMCG (SEQ ID NO: 91) 
GCPDWYLEYMCG (SEQ ID NO: 92) 
GCKYWQWFDKCG (SEQ ID NO: 93) 
GCDHWMWYDKCG (SEQ ID NO: 94) 
SNESGWVWL (SEQ ID NO: 95)

Figure 1-3

QSNSGWVWV (SEQ ID NO: 96) 
RTESGWVWT (SEQ ID NO: 97) 
RANSGWVWV (SEQ ID NO: 98) 
YDNSGWVWH (SEQ ID NO: 99) 
LSDSGWVWVP (SEQ ID NO: 100) 
EQSNSGWVWVGGGGC (SEQ ID NO: 101) 
CEQSNSGWVWV (SEQ ID NO: 102) 
EQSNSGWVWVGGGGCKKK (SEQ ID NO: 103) 
EQSNSGWVWVGKKKC (SEQ ID NO: 104) 
EQSNSGWVWVGKKK (SEQ ID NO: 105) 
KKKEQSNSGWVWV (SEQ ID NO: 106) 
EQSNSGWVWVGKKKSKKK (SEQ ID NO: 107) 
EQSNSGWVWVGGCKKK (SEQ ID NO: 108) 
EQSNSGWVWVGGGGGGCKKK (SEQ ID NO: 109)
SNESGWVWLP (SEQ ID NO: 110)
EQSNSGWVWV (SEQ ID NO: 111)
SRTESGWVWT (SEQ ID NO: 112)
QRANSGWVWV (SEQ ID NO: 113)
DYDNSGWVWH (SEQ ID NO: 114)
EQSNSGWVWVGKKKK (SEQ ID NO: 115)
EQSNSGWVWVGGGGSKKK (SEQ ID NO: 116) 
EQSNSGWVWVGGGGS (SEQ ID NO: 117) 

EQSNSGWVWVGGGGSEQSNSGWVWVGGGGS (SEQ ID NO: 118) 

RYQSFELSDSGWVWVPVARH (SEQ ID NO: 119)

ERDWFC (SEQ ID NO: 120) 

ERDWGC (SEQ ID NO: 121) 

ERLWFC (SEQ ID NO: 122) 

ERSYFC (SEQ ID NO: 123) 

ERGWFC (SEQ ID NO: 124) 

EREWFC (SEQ ID NO: 125) 

ERAWFC (SEQ ID NO: 126) 

ERLYFC (SEQ ID NO: 127) 

ERYFMC (SEQ ID NO: 128) 

ERLFLC (SEQ ID NO: 129)

ERALMC (SEQ ID NO: 130) 

ERDVMC (SEQ ID NO: 131) 

ERKWFC (SEQ ID NO: 132) 

ETWGERDWFC (SEQ ID NO: 133) 

ETWGERDWGC (SEQ ID NO: 134) 

STAERLWFCG (SEQ ID NO: 135) 

YETAERSYFC (SEQ ID NO: 136) 

ADNAERGWFC (SEQ ID NO: 137) 

QSNSEREWFC (SEQ ID NO: 138) 

STSERAWFCG (SEQ ID NO: 139) 

ASWSERGWFC (SEQ ID NO: 140)

Figure 1-4

ELSSEREWFC (SEQ ID NO: 141)
DMQGERGWFC (SEQ ID NO: 142)
SSSERAWFCG (SEQ ID NO: 143) 
GNMRERLYFC (SEQ ID NO: 144) 
QPNRERYFMC (SEQ ID NO: 145) 
SVTRERLFLC (SEQ ID NO: 146)
IPLSERALMCSSWNC (SEQ ID NO: 147) 
WARSERDVMCLSYVC (SEQ ID NO: 148) 
QSNSEREWFCG (SEQ ID NO: 149) 
QSNSEREWFCGGGGS (SEQ ID NO: 150) 
NLEEALAQERLWFCRSGNC (SEQ ID NO: 151) 
NLESYEMEERKWFCKMFSC (SEQ ID NO: 152) 
DMVYAYPPW (SEQ ID NO: 153) 
EMVYTVPYW (SEQ ID NO: 154) 
DMVYAYPPWS (SEQ ID NO: 155) 
DEMVYTVPYW (SEQ ID NO: 156) 
CESRLVECSRMC (SEQ ID NO: 157) 
CETYMTYVYWLC (SEQ ID NO: 158) 
CGERLAECARLC (SEQ ID NO: 159) 
CESRLRECSMLC (SEQ ID NO: 160) 
CEARLSECSRIC (SEQ ID NO: 161) 
CPARLLECSRMC (SEQ ID NO: 162) 
CESVGVGDWWSC (SEQ ID NO: 163) 
CEDRLVEGPWVC (SEQ ID NO: 164)
CNDQFRTCVDVC (SEQ ID NO: 165) 
CRGEWWELYHPC (SEQ ID NO: 166) 
CEDTRTGWAWSC (SEQ ID NO: 167) 
CTWLSSGELVWC (SEQ ID NO: 168) 
CWPPVCEVSGIC (SEQ ID NO: 169) 
CSLSPIQLQHLC (SEQ ID NO: 170) 
CLARLEECSRFC (SEQ ID NO: 171) 
CHNSSPMVGVTC (SEQ ID NO: 172) 
CHVSPVQIKALC (SEQ ID NO: 173) 
CAAPATSWFQYC (SEQ ID NO: 174) 
CASKLHECSLRC (SEQ ID NO: 175) 
CEPMDSNGIVQC (SEQ ID NO: 176) 
CQYASAADEQRC (SEQ ID NO: 177) 
CEYWDEPSLSWC (SEQ ID NO: 178) 
CERECFQMLERC (SEQ ID NO: 179) 
CGMSTDELDEIC (SEQ ID NO: 180) 
CYVSPSTGLYSC (SEQ ID NO: 181) 
CEARLVECSRLC (SEQ ID NO: 182) 
CESRLSECSRMC (SEQ ID NO: 183) 
CELKLQECARRC (SEQ ID NO: 184) 
CELKLQEAARRC (SEQ ID NO: 185)

Figure 1-5

CLERLEECSRFC (SEQ ID NO: 186)
GGCESRLVECSRMC (SEQ ID NO: 187)
GGCETYMTYVYWLC (SEQ ID NO: 188)
EWLCESVGVGDWWSC (SEQ ID NO: 189)
YHPCEDRLVEGPWVCCRS (SEQ ID NO: 190) 
WLLCNDQFRTCVDVCDNV (SEQ ID NO: 191) 
IAECRGEWWELYHPCLAA (SEQ ID NO: 192) 
TWYCEDTRTGWAWSCLEL (SEQ ID NO: 193) 
QLDCTWLSSGELVWCSDW (SEQ ID NO: 194) 
QFDCTWLSSGELVWCSDW (SEQ ID NO: 195) 
CWPPVCEVSGICS (SEQ ID NO: 196) 
CGCSLSPIQLQHLC (SEQ ID NO: 197) 
CGCHVSPVQIKALC (SEQ ID NO: 198) 
GCHVSPVQIKALC (SEQ ID NO: 199) 
GTSCAAPATSWFQYCVLP (SEQ ID NO: 200) 
RMDCASKLHECSLRCAYA (SEQ ID NO: 201) 
GWCEPMDSNGIVQCSMR (SEQ ID NO: 202) 
IDVCQYASAADEQRCLRI (SEQ ID NO: 203) 
NVLCEYWDEPSLSWCLSS (SEQ ID NO: 204) 
CQCERECFQMLERC (SEQ ID NO: 205) 
FCSCGMSTDELDEICAIW (SEQ ID NO: 206) 
EEVCYVSPSTGLYSCYDQ (SEQ ID NO: 207) 
LLDICELKLQECARRCN (SEQ ID NO: 208) 
GGGLLDICELKLQECARRCN (SEQ ID NO: 209) 
GRTGGGLLDICELKLQECARRCN (SEQ ID NO: 210) 
LGIEGRTGGGLLDICELKLQECARRCN (SEQ ID NO: 211)
LLDICELKLQEAARRCN (SEQ ID NO: 212) 
KLLDICELKLQEAARRCN (SEQ ID NO: 213) 
EEKLRECAR (SEQ ID NO: 214) 
EARLAECAR (SEQ ID NO: 215) 
CMKLMECAR (SEQ ID NO: 216) 
ELRLRECAH (SEQ ID NO: 217) 
EAKLHECAR (SEQ ID NO: 218) 
ELKLAECAR (SEQ ID NO: 219) 
EARLEECAR (SEQ ID NO: 220) 
EAKLRECAR (SEQ ID NO: 221) 
ELRLAECAR (SEQ ID NO: 222) 
ESRLAECAR (SEQ ID NO: 223) 
EAKLVECAR (SEQ ID NO: 224) 
ESRLRECAR (SEQ ID NO: 225) 
EAKLAECAR (SEQ ID NO: 226) 
QWRLEECAR (SEQ ID NO: 227) 
QLRLEECAR (SEQ ID NO: 228) 
ELRLEECAR (SEQ ID NO: 229) 
EAKLLECAR (SEQ ID NO: 230)

Figure 1-6

EARAGVCAG (SEQ ID NO: 231)
EAKAGVCAG (SEQ ID NO: 232)
VARLEECAR (SEQ ID NO: 233) 
ELKLDECAR (SEQ ID NO: 234) 
EWRLQECAR (SEQ ID NO: 235) 
EAKLSECAR (SEQ ID NO: 236) 
EARLSECAR (SEQ ID NO: 237) 
ELKLLECAR (SEQ ID NO: 238) 
ELRLQECGR (SEQ ID NO: 239) 
EQKLAECAR (SEQ ID NO: 240) 
ELRLQECAR (SEQ ID NO: 241) 
ELKLEECAR (SEQ ID NO: 242) 
ESRLEECAR (SEQ ID NO: 243) 
EATVQECAR (SEQ ID NO: 244) 
ELKLQECAR (SEQ ID NO: 245) 
YSRLEECGR (SEQ ID NO: 246) 
ELRLRECAL (SEQ ID NO: 247) 
EARLLECAR (SEQ ID NO: 248) 
ESRLLECAR (SEQ ID NO: 249) 
VLKLEECAR (SEQ ID NO: 250) 
ESKLAECAR (SEQ ID NO: 251) 
ESKLRECAR (SEQ ID NO: 252) 
EYKLGECAR (SEQ ID NO: 253) 
ESRLQECAR (SEQ ID NO: 254) 
QARLAECAR (SEQ ID NO: 255) 
ELKKQECAR (SEQ ID NO: 256) 
ESRLSECAR (SEQ ID NO: 257) 
EARLEECGR (SEQ ID NO: 258)
ESRLAECGR (SEQ ID NO: 259) 
EWRLEECAR (SEQ ID NO: 260) 
EARLSECGR (SEQ ID NO: 261) 
AARLAECAR (SEQ ID NO: 262) 
EWKLAECAR (SEQ ID NO: 263) 
ESKLEECAR (SEQ ID NO: 264) 
DVKLAECAR (SEQ ID NO: 265) 
ELQLEECAR (SEQ ID NO: 266) 
EYKLASCAR (SEQ ID NO: 267) 
RLSICEEKLRECARGC (SEQ ID NO: 268) 
PLTTCEARLAECARQL (SEQ ID NO: 269) 
LALCMKLMECARRY (SEQ ID NO: 270) 
ELVMCELRLRECAHRA (SEQ ID NO: 271) 
PLARCEAKLHECARQL (SEQ ID NO: 272)
LLSVCELKLAECARSK (SEQ ID NO: 273) 
RLEWCEARLEECARRC (SEQ ID NO: 274) 
RLRWEAKLRECARGR (SEQ ID NO: 275)

Figure 1-7

CVAHLELRLAECARQI (SEQ ID NO: 276)
HLARCESRLAECARQL (SEQ ID NO: 277)
RLALLEAKLVECARRL (SEQ ID NO: 278)
DLFSLESRLRECARRV (SEQ ID NO: 279) 
AVPVLEAKLAECARRF (SEQ ID NO: 280) 
YLQQLQWRLEECARGM (SEQ ID NO: 281) 
YLELCQLRLEECARQFN (SEQ ID NO: 282) 
ELfflCELRLEECARGR (SEQ ID NO: 283) 
RVARCELRLAECARKS (SEQ ID NO: 284) 
YLEVLESRLAECARWK (SEQ ID NO: 285) 
EAKLLECARAR (SEQ ID NO: 286) 
ELSLCEARAGVCAGSVTK (SEQ ID NO: 287) 
ELSLCEAKAGVCAGSVTK (SEQ ID NO: 288) 
ALWQCVARLEECARSR (SEQ ID NO: 289) 
CLKSCELKLDECARRM (SEQ ID NO: 290) 
ALQTCEWRLQECARSR (SEQ ID NO: 291) 
YISQCEAKLAECARLY (SEQ ID NO: 292) 
ELSSCEAKLSECARRW (SEQ ID NO: 293) 
ELSSCEARLSECARRW (SEQ ID NO: 294) 
QLLQCELKLLECARQG (SEQ ID NO: 295) 
ELLRCEARLAECARGC (SEQ ID NO: 296) 
QLRQCELRLQECGRHGN (SEQ ID NO: 297) 
PLTSCEQKLAECARRF (SEQ ID NO: 298) 
LLGMCELRLQECARAK (SEQ ID NO: 299) 
ELSRCELKLEECARGM (SEQ ID NO: 300) 
DCRPCESRLEECARRL (SEQ ID NO: 301) 
RLSVCEARLEECARQL (SEQ ID NO: 302) 
PLKMCEATVQECARLI (SEQ ID NO: 303) 
LLLFCEARLSECARHV (SEQ ID NO: 304) 
SLSMCEARLAECARLL (SEQ ID NO: 305) 
PLFSCELKLQECARRCN (SEQ ID NO: 306) 
SLERCYSRLEECGRRI (SEQ ID NO: 307) 
PLTSCELRLRECALRSN (SEQ ID NO: 308) 
KLAACELKLAECARRW (SEQ ID NO: 309) 
KLAACELRLAECARRW (SEQ ID NO: 310) 
ALTRCELRLAECARKI (SEQ ID NO: 311)
LLQQCELKLAECARSI (SEQ ID NO: 312) 
QLWQCEARLLECARRS (SEQ ID NO: 313) 
RLRLCESRLLECARSL (SEQ ID NO: 314) 
QLETCVLKLEECARRCN (SEQ ID NO: 315) 
ALSQCELRLAECARSVTK (SEQ ID NO: 316) 
ELKLAECARRS (SEQ ID NO: 317) 
ALSRCESKLAECARRQ (SEQ ID NO: 318) 
LMSTCESKLRECARSL (SEQ ID NO: 319) 
SLQRCEYKLGECARSL (SEQ ID NO: 320)

Figure 1-8

RLELLESRLQECARQLN (SEQ ID NO: 321)
QMEWCQARLAECARCCN (SEQ ID NO: 322) 
PLFSCELKKQECARRCN (SEQ ID NO: 323) 
LLDKCESRLSECARRL (SEQ ID NO: 324) 
LLARCEARLEECGRQC (SEQ ID NO: 325) 
DLLYCESRLAECGRM (SEQ ID NO: 326) 
ALQMCEWRLEECARRL (SEQ ID NO: 327) 
LLTMCEARLSECGRRL (SEQ ID NO: 328) 
ALWRCESRLAECARRS (SEQ ID NO: 329) 
LLATCAARLAECARQL (SEQ ID NO: 330) 
LQTCEWKLAECARSN (SEQ ID NO: 331) 
PLRSCESKLEECARQL (SEQ ID NO: 332) 
CLRALDVKLAECARHL (SEQ ID NO: 333) 
RLKTLELQLEECARRS (SEQ ID NO: 334) 
KLRDVELKLAECARRS (SEQ ID NO: 335) 
SLQRCEYKLASCARSL (SEQ ID NO: 336) 
RLARCELRLAECARKS (SEQ ID NO: 337) 
DLWYLESKLEECARRCN (SEQ ID NO: 338) 
DLWYLESKLEECARRANG (SEQ ID NO: 339) 
DLWYLESKLEECARRCNG (SEQ ID NO: 340) 
KQRELELKLAECARRS (SEQ ID NO: 341) 
QMQEWCARLAECARCCN (SEQ ID NO: 342) 
LLDICELKLQECARRAN (SEQ ID NO: 343) 

AERKAEERRW (SEQ ID NO: 344) 

AERYAEEREG (SEQ ID NO: 345) 

AERMAEERRW (SEQ ID NO: 346) 

AERKAEERRR (SEQ ID NO: 347) 

AHRNAEERRW (SEQ ID NO: 348) 

AERKSEDWRW (SEQ ID NO: 349) 

AERKAEEKRR (SEQ ID NO: 350) 

AERQAETRRW (SEQ ID NO: 351) 

AERNAEERRW (SEQ ID NO: 352) 

AERQAEERRW (SEQ ID NO: 353) 

AERRAEERRW (SEQ ID NO: 354) 

AERDAEQRRW (SEQ ID NO: 355) 

AERIAEERRW (SEQ ID NO: 356) 

AERSAEERRW (SEQ ID NO: 357) 

AERKAEELRW (SEQ ID NO: 358) 

AERKAEESRW (SEQ ID NO: 359) 

EERKAEERRW (SEQ ID NO: 360) 

ADGKAEERRW (SEQ ID NO: 361) 

ADGKAEELRW (SEQ ID NO: 362) 

ADGMPEERRW (SEQ ID NO: 363) 

ADGEAEKRRW (SEQ ID NO: 364) 

ADGNAEERRW (SEQ ID NO: 365)

Figure 1-9

ADGEAEKARW (SEQ ID NO: 366)
AEGEAEKARW (SEQ ID NO: 367)
GERKAEERRW (SEQ ID NO: 368)
AEREAEERRW (SEQ ID NO: 369) 
ADGEAEARRW (SEQ ID NO: 370) 
ADGRAEEARW (SEQ ID NO: 371)
AEGRAEEARW (SEQ ID NO: 372) 
AEREAEKARW (SEQ ID NO: 373) 
AERKAEEQRW (SEQ ID NO: 374) 
AERDAEKRRW (SEQ ID NO: 375) 
AEREAEKLRW (SEQ ID NO: 376) 
MLAERKAEERRWFNTHGRE (SEQ ID NO: 377) 
MLAERKAEERRWFNTHGREK (SEQ ID NO: 378) 
GGGMLAERKAEERRWFNTHGRE (SEQ ID NO: 379)
CMLAERKAEERRWFNTHGRE (SEQ ID NO: 380) 
CMLAERKAEERRWFNTHGREK (SEQ ID NO: 381) 
MLAERYAEEREGFNMQWRE (SEQ ID NO: 382) 
MLAERMAEERRWFRRMG (SEQ ID NO: 383) 
IVAERKAEERRRLNTEGHE (SEQ ID NO: 384) 
ILAHRNAEERRWFQKHGR (SEQ ID NO: 385) 
MLAERKSEDWRWLKTHGRD (SEQ ID NO: 386) 
MLAERKAEEKRRLKTQGRE (SEQ ID NO: 387) 
ILAERQAETRRWMRNAGSVTK (SEQ ID NO: 388) 
MLAERNAEERRWLKRQCG (SEQ ID NO: 389) 
MLAERQAEERRWLKMHGGE (SEQ ID NO: 390) 
MLAERRAEERRWLKTQGGD (SEQ ID NO: 391) 
MLAERQAEERRWLKTQGRD (SEQ ID NO: 392) 
MLAERKAEERRWFKTHGRE (SEQ ID NO: 393) 
MLAERKAEERRWFNNQGRE (SEQ ID NO: 394)
MPAERDAEQRRWLKTHGRE (SEQ ID NO: 395)
ILAERIAEERRWLKTQGR (SEQ ID NO: 396)
MLAERKAEERRWLQTHGRE (SEQ ID NO: 397)
ILAERSAEERRWLKTQGRE (SEQ ID NO: 398)
LLAERKAEELRWLKTHGRE (SEQ ID NO: 399)
MLAERKAEERRWLQTHGRE (SEQ ID NO: 400)
MLAERNAEERRW (SEQ ID NO: 401) 
MFAERKAEESRWLQSQGRE (SEQ ID NO: 402) 
MLEERKAEERRWLKTHGR (SEQ ID NO: 403) 
MLAERKAEERRWLKMQGRE (SEQ ID NO: 404) 
MLAERNAEERRWFYTHGRE (SEQ ID NO: 405) 
MLADGKAEERRWLKTHGLD (SEQ ID NO: 406) 
MIADGKAEERRWLKTHGRD (SEQ ID NO: 407) 
MLADGKAEELRWLKTQGSD (SEQ ID NO: 408) 
MLAERNAEERRWLKTHGRD (SEQ ID NO: 409) 
MLADGKAEELRWLKTQGRE (SEQ ID NO: 410)

Figure 1-10

ILADGKAEERRWLKTHGRD (SEQ ID NO: 411)
MLADGMPEERRWLQTHGRD (SEQ ID NO: 412) 
MLADGEAEKRRWLNTHGRD (SEQ ID NO: 413) 
MLADGNAEERRWLMTHGRD (SEQ ID NO: 414) 
MLADGEAEKARWLKTQGRE (SEQ ID NO: 415) 
MLAEGEAEKARWLKTQGRE (SEQ ID NO: 416) 
MLADGKAEERRWLKTQGRE (SEQ ID NO: 417) 
MLAERKAEERRWLSAHVRE (SEQ ID NO: 418) 
LLGERKAEERRWYKTHARE (SEQ ID NO: 419) 
MLAERKAEERRWLMTHGHD (SEQ ID NO: 420) 
MLAERKAEERRWLKSQCLE (SEQ ID NO: 421) 
LLAEREAEERRWFKTHGRE (SEQ ID NO: 422) 
MLADGEAEARRWFNMHGRE (SEQ ID NO: 423) 
MLADGRAEEARWLKTQGSE (SEQ ID NO: 424) 
MLAEGRAEEARWLKTQGSE (SEQ ID NO: 425)
MLAEREAEKARWLKTQGRE (SEQ ID NO: 426) 
MMAERKAEEQRWFDIHGRD (SEQ ID NO: 427) 
LTAERDAEKRRWLLTHGGE (SEQ ID NO: 428) 
MLAERQAEERRWLKSQRGE (SEQ ID NO: 429) 
LLAERKAEERRWFATHGRD (SEQ ID NO: 430) 
MLAEREAEKLRWLKSQERA (SEQ ID NO: 431) 
MLAERKAEERRWLKTHGGE (SEQ ID NO: 432) 
CTWTDLESVY (SEQ ID NO: 433) 
HTTNEQFFMC (SEQ ID NO: 434) 
DTWLELESRY (SEQ ID NO: 435) 
HNSSPMVGVT (SEQ ID NO: 436) 
DWQKTIPAYW (SEQ ID NO: 437) 
RWGREGLVAALL (SEQ ID NO: 438) 
WSGTRVWRCWT (SEQ ID NO: 439) 
MSLLSYLRS (SEQ ID NO: 440) 
LDLLAI (SEQ ID NO: 441) 
RIYGVK (SEQ ID NO: 442) 
MWHMFMSLLF (SEQ ID NO: 443) 
FFWASWMHLLW (SEQ ID NO: 444) 
FDDCWREREQFLFQAL (SEQ ID NO: 445) 
CGRASECFRLLEM (SEQ ID NO: 446) 
RECFQMLER (SEQ ID NO: 447) 
CSIRWDFVPGYGLC (SEQ ID NO: 448) 
WMQCWDSLSLCYDM (SEQ ID NO: 449) 
ALLMCESKLAECARAR (SEQ ID NO: 450) 
LAHCKKRKEECAAG (SEQ ID NO: 451) 
SIDGVYLRTSRT (SEQ ID NO: 452)
SIDGVYLRTRSRTRY (SEQ ID NO: 453) 
VRWLRGSTLRGLRDR (SEQ ID NO: 454) 
DRGGGTVGVYWWESY (SEQ ID NO: 455)

Figure 1-11

VWGTVGTWLEY (SEQ ID NO: 456)
LMWVSAY (SEQ ID NO: 457)
RASDEYGALVRFCTNL (SEQ ID NO: 458) 
NYWCDSNWVCEIA (SEQ ID NO: 459) 
LAHCLLRLEECAAG (SEQ ID NO: 460) 
LALCLARLRECAGG (SEQ ID NO: 461) 
CESRLVECSRM (SEQ ID NO: 462) 
LLDIAELKLQECARRCN (SEQ ID NO: 463) 
KLLDIAELKLQECARRCN (SEQ ID NO: 464) 
CSTGGGLTAERDAEKRRWLLTHGGE (SEQ ID NO: 465) 
LTAERDAEKRRWLLTHGGEGG (SEQ ID NO: 466) 
LTAERDAEKRRWLLTHGGEGGK (SEQ ID NO: 467) 
LTAERDAEKRRWLLTHGGEGGGGG (SEQ ID NO: 468) 
LTAERDAEKRRWLLTHGGEGGGGGK (SEQ ID NO: 469) 
ESGWVW (SEQ ID NO: 470) 
NSGWVW (SEQ ID NO: 471) 
SGWVW (SEQ ID NO: 472) 
PLGKCEATCREMARYFN (SEQ ID NO: 473) 
SLQRCEYKLASVRGLCN (SEQ ID NO: 474) 
DLWYLESKLEEAARRCNG (SEQ ID NO: 475) 
PYMGTRSRAKLLRQQ (SEQ ID NO: 476) 
RNAGERRWFKTQGWY (SEQ ID NO: 477) 
MLAERNADDRRWFNTHGRD (SEQ ID NO: 478) 
MMADGRLRNSVGLILWCD (SEQ ID NO: 479) 
MLADGRLRNWG (SEQ ID NO: 480) 
LLADVRRRNGVGLLRMGRD (SEQ ID NO: 481) 
MLADGRLRNFGG (SEQ ID NO: 482) 
TYMTYVYWLC (SEQ ID NO: 483) 
RFGERWGL (SEQ ID NO: 484) 
HWLWWGWNF (SEQ ID NO: 485) 
RECFQMLERC (SEQ ID NO: 486) 
ILAHRNAKERRWFQKHGR (SEQ ID NO: 487) 
CSTGGGLTAERDAEKRRWLLTHGGEK (SEQ ID NO: 489) 
KGGGMLAERKAEERRWFNTHGRE (SEQ ID NO: 490) 
KSTGGLTAERDAEKRRWLLTHGGE (SEQ ID NO: 491) 
EQSNSGWVWVGGGGCKKKC (SEQ ID NO: 492)

Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8



[Figure 8 plot not reproduced; recoverable axis label: Absorbance at 210 nm]



WO 02/07676 PCT/US01/23046

Figure 9A
Figure 10A